How enterprises can derive value through voice innovation

by Max Smolaks

by Ian Firth Speechmatics 19 February 2020

Automatic Speech Recognition (ASR) used to be limited to science fiction. You may recall Star Trek’s Captain Picard interrogating his ship’s computer via voice interface. Thirty years on from The Next Generation, we discover speech recognition technology deeply integrated into our lives both at home and at work.

In fact, the speech recognition technology market is in a huge growth cycle, expected to be valued at a staggering USD 4.1 billion in 2024.

The ASR industry has come a long way since speech-to-text APIs first became commercially available. These systems are no longer relegated to relatively straightforward executions, such as the automatic subtitling of TV or film content, or turning on your lights at home. Complex integrations now derive insight and analysis from voice data across a diverse range of industry sectors, such as financial services, healthcare, retail, and media and entertainment. Vendors can extract meaning from data that was previously unavailable – not only voice triggers, but entire conversations, languages and nuances across multiple speakers.

ASR technology is being used to offer real-time transcription, and is used extensively in the call center industry. Accurate speech recognition technology provides the call center with analysis of customer conversations, helping to deliver insight into purchasing habits and to gauge emotional sentiment throughout the customer service journey. It can also be used to prevent mis-selling and to ensure regulatory compliance by capturing potentially sensitive information during customer calls and alerting organizations to its presence.

On a broader level, the technology enables a full-scale overhaul of customer experience, allowing call center teams, for example, to understand what customers are saying far more quickly and drive precise actions from there. It also aids the hearing-impaired and the situationally disadvantaged.

In the case of ASR, businesses should aspire to create enterprise applications that use voice data in real time, identifying context, punctuation and dialect across hundreds of languages, on demand, worldwide. This not only creates more accurate and efficient results than human teams, but also maximizes revenue by reducing overhead and providing direct, tangible results to senior and board-level executives.

The analysis of single streams of data (text, video or audio) will soon no longer be enough across the enterprise. In the coming months and years, an increasing number of vertical industries will start harnessing the ‘full signal’ of all three, streamlining processes, empowering workers, and ultimately amplifying the ability of organizations to scale and reallocate resources to more profit-building, strategic pursuits. By harnessing the boundless potential of machine learning, which ‘learns as you learn’, ASR providers can transform the efficiency and profitability of corporate services, optimizing existing workflows and opening the door to new, innovative forms of understanding their customers.


Ian Firth is VP of Products at Speechmatics, a British company which develops automatic speech recognition software based on recurrent neural networks and statistical language modeling.
