Sentiment analysis for language assistants – How Alexa & Co. understand what users really want to say

When it comes to customer service and interpreting product reviews, it is crucial that speech assistants correctly recognize and interpret the speaker's mood and tonality.

December 8, 2020

5 Min Read

“Alexa – what is the role of pattern recognition in sentiment analysis?”

Okay, maybe not. But then again, maybe let’s start there – with “maybe”. Think of all the ways you can say the word “maybe”: neutrally, to express agreement, or to signal dislike. Based on your tone and facial expression, a human listener can usually interpret your sentiment.

In communication via voice technology, sentiment analysis still has some catching up to do. When it comes to customer service and interpreting product reviews, it is crucial that speech assistants correctly recognize and interpret the mood and tonality of the speaker. With more people communicating every day via voice commands with the likes of Alexa and Siri, you may have already asked yourself: “What happens to my speech when it is picked up by the microphone?”

The simplified answer is pattern recognition. The spoken word is first digitized – converted into binary code. Individual sounds, words and contexts subsequently lose their meaning – at least for humans. Machines compare these speech building blocks with stored digital models. This comparison takes place on many levels – from simple recognition of spoken digits, used to process a menu selection in a hotline queue, all the way to the calculation of highly complex semantic networks that can recognize relational meanings in longer documents. One example of this kind of pattern recognition can be found in sentiment analysis.

When syntax becomes semantics

Sentiment analysis turns syntax into semantics. The correct combination of individual linguistic units forms a sentence; that statement then acquires meaning through its tonality and context, as well as through moods and feelings. High-performance applications, supported by complex machine learning models, capture the context of our spoken and written statements in order to quantify politeness, vehemence and, of course, the factual content.

Most applications return a fairly simple evaluation, consisting of keywords and a matching probability. This output can be processed algorithmically right away, stored and used by other applications. For this purpose, an emotional state is determined as a polarity – joy versus anger, for example – along with its probability, a value between zero and one. A return value of “joy = 0.78456”, for instance, indicates that the statement is most likely a happy, positive one.
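A return value of this kind could be modeled as follows. This is a minimal sketch only: the function, field names and the fixed probabilities are illustrative, not taken from any particular sentiment API.

```python
# Minimal sketch of a sentiment-analysis return value: an emotional
# polarity label plus a probability between zero and one.
# A real system would run a machine-learning model here; the scores
# below are hard-coded purely for demonstration.

def classify_sentiment(text: str) -> dict:
    """Toy classifier: return the most likely emotion and its score."""
    scores = {"joy": 0.78456, "anger": 0.12, "neutral": 0.09544}
    label = max(scores, key=scores.get)
    return {"label": label, "score": scores[label]}

result = classify_sentiment("What a wonderful day!")
print(result)  # {'label': 'joy', 'score': 0.78456}
```

Because the result is just a label and a number, downstream systems can store, aggregate and act on it without any language understanding of their own.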

Sentiment analysis applications are available at different performance levels. The simplest software versions search texts for terms that can be unambiguously assigned to an emotional state – a so-called “bag of words”. Statements like “today I feel excellent” or “man, is the weather nasty!” are easily quantifiable because of the adjectives they contain. It becomes more complicated when the application has to recognize the entire meaning of longer statements or texts, or a tonality that changes within a statement.
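The bag-of-words approach can be sketched in a few lines: count how many clearly positive versus clearly negative terms a statement contains. The tiny lexicon here is purely illustrative; real systems use lexicons with thousands of weighted entries.

```python
# A minimal "bag of words" sentiment scorer: compare the number of
# clearly positive terms against the number of clearly negative ones.
# The lexicon below is illustrative only.

POSITIVE = {"excellent", "great", "happy", "wonderful"}
NEGATIVE = {"nasty", "boring", "awful", "terrible"}

def bag_of_words_polarity(text: str) -> str:
    words = [w.strip(".,!?") for w in text.lower().split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(bag_of_words_polarity("Today I feel excellent"))      # positive
print(bag_of_words_polarity("Man, is the weather nasty!"))  # negative
```

The weakness is equally visible: the scorer has no notion of negation or word order, so “not excellent” would still count as positive.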

For this purpose, semantic networks are used that understand the relationships of individual words to each other. For example, if a user gives the voice command: "I'm looking for accommodation for myself and my 100 chickens", the language technology must recognize that it is not a hotel that is being searched for, but rather a rural property.

A further level of complexity is represented by so-called ontologies, which recognize individual terms as a collection of properties, which in turn are conceptually connected with other terms. For example, the statement: "that was surprising!" would usually be positive when referring to a movie while in the context of the use of a software application it would be rather negative.
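The ontology idea can be reduced to a toy lookup: the same term maps to a different polarity depending on the domain it appears in. The table and domain names below are illustrative assumptions, not part of any real ontology.

```python
# Sketch of context-dependent polarity: the same word carries a
# different sentiment depending on the domain it is used in.
# Both the table and the domain labels are illustrative only.

CONTEXT_POLARITY = {
    ("surprising", "movie"): "positive",
    ("surprising", "software"): "negative",
}

def polarity_in_context(term: str, domain: str) -> str:
    return CONTEXT_POLARITY.get((term.lower(), domain), "neutral")

print(polarity_in_context("surprising", "movie"))     # positive
print(polarity_in_context("surprising", "software"))  # negative
```

A genuine ontology would of course derive this from networks of related concepts and their properties rather than from a hand-written table, but the effect on the final polarity score is the same.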

Sentiment analysis in the marketing context

The added value and advantages of sentiment analysis for companies are obvious: where people have to read and interpret long, nested or even incorrect texts themselves, software automatically evaluates text resources – or even spoken content.

Such applications save time and money, especially when it comes to social media monitoring or analyzing customer reviews and service feedback, such as: "that movie was the most boring thing I've ever seen. Save the entrance fee," or "My bank put me on hold for hours again; I think I’ll take my business elsewhere!”

Companies often use sentiment analysis for "opinion mining" or opinion analysis. For example, every online retailer or financial service provider wants to know what is being written about them on social media, what their target group wants or what the mood of a consumer is when they call the call center. By using voice interfaces programmed specifically for these contexts, it is possible to convert statements of this kind via speech-to-text in order to evaluate them with sentiment analysis APIs.
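The pipeline described above – speech-to-text followed by a sentiment-analysis call – might look like this. Both functions are stand-ins: a real system would call a speech recognition service and a sentiment API, and the cue words and confidence formula are invented for the sketch.

```python
# Sketch of the opinion-mining pipeline: convert speech to text, then
# pass the transcript to a sentiment-analysis step. Both functions are
# stand-ins for real services; the cue list is illustrative.

def speech_to_text(audio: bytes) -> str:
    """Stand-in for a speech recognition service."""
    return "My bank put me on hold for hours again"

def analyze_sentiment(text: str) -> dict:
    """Stand-in for a sentiment-analysis API call."""
    negative_cues = {"hold", "boring", "rash"}
    hits = sum(w.strip(".,!") in negative_cues for w in text.lower().split())
    return {
        "polarity": "negative" if hits else "neutral",
        "confidence": min(0.5 + 0.3 * hits, 0.99),
    }

transcript = speech_to_text(b"...")   # raw audio in, text out
print(analyze_sentiment(transcript))  # {'polarity': 'negative', 'confidence': 0.8}
```

In production, the two stages are usually separate services, which is why voice interfaces "programmed specifically for these contexts" matter: the speech-to-text stage must produce transcripts the sentiment stage can actually interpret.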

Such an application is able to evaluate the emotionality and polarity of a statement such as: “my baby got a rash from conventional nappies, but since switching to a fragrance-free brand, this hasn’t happened!” However, speech assistants usually only “listen” for a few seconds when processing statements. While this makes them perfectly suited for capturing short recommendations or opinions, more in-depth analysis – of a continuous text, for example – is not possible.

By using sentiment analysis in speech applications, companies can use insights to improve their products or services – a development that takes the benefits of speech technology in marketing to a new level.

With fewer in-person interactions taking place, what consumers say online will soon be one of the only ways for businesses to gauge public opinion, making sentiment analysis increasingly important. Rather than waiting until there is no other option, companies should invest in this technology now so they can continue to reach new audiences and stay ahead of the curve.

About the author: Dan Fitzpatrick is Reply's Practice Leader - "Voice Machine Interfaces" and as Head of Experience Technology he leads the technical team at Triplesense Reply. He is responsible for the high quality of technical solutions and the development and publishing processes.
