

Natural Language Processing Use Case – How Do Personal Assistant Apps Work?

6 minute read | June 10, 2020
Sakshi Gupta


The ability to communicate our ideas and thoughts, and to interact effectively, is what makes us human. How does it feel to learn that this ability is no longer our monopoly? Thanks to natural language processing, the machines around us can now ‘talk’ to us (and to each other, but that’s a story for another time). The best part? They interact with us in our language. Machines can not only understand our language but also reply and perform assigned actions. In this article, we’ll talk about NLP (natural language processing) and how it is the driving force behind personal assistant apps – Alexa, Siri, Cortana, and Google Assistant.

First things first – what is natural language processing? 

So, here’s the thing – computer-human interaction is not a novel concept. We have been asking computers to perform actions for generations. The only difference is that, back then, we used the language they (computers) knew best – programming languages – which are precise, unambiguous, and highly structured.

Now we use what we (humans) know best – human speech – which is not always precise, is often ambiguous, and has a linguistic structure that depends on many variables, including but not limited to slang, regional dialects, and social context. It is only now that you can ask Alexa to order a pizza or ask Siri for the weather forecast – all in your own language.

How does the latter happen? NLP is the answer.

Natural language processing is, therefore, the technology used to program computers to understand, process, and generate language the way we use it. It is how computers and humans interact with each other. NLP helps machines read, decipher, understand, and make sense of human language.

It is no longer about interpreting text or speech mechanically, based on keywords. Rather, it is about understanding the meaning behind those words. That is what makes sentiment analysis possible.

When you’re typing on your smartphone, you’ll see word suggestions based on the words you’ve already typed and what you’re typing now. That’s natural language processing in action.
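The idea behind those suggestions can be sketched with a toy next-word predictor. This is a minimal illustration only – the tiny corpus and the `suggest` helper are invented here, and real keyboards use far more sophisticated models than bigram counts:

```python
from collections import Counter, defaultdict

# Tiny corpus standing in for a user's typing history (illustrative only).
corpus = (
    "i am going home . i am going out . i am happy . "
    "going home now . going out tonight ."
).split()

# Count how often each word follows each other word (bigram counts).
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def suggest(prev_word, k=3):
    """Return up to k of the most frequent next words after prev_word."""
    return [w for w, _ in following[prev_word].most_common(k)]

print(suggest("going"))  # 'home' and 'out' are the most frequent followers
```

A real system would also weigh the longer context, your personal vocabulary, and a trained language model, but the core idea – predict the next word from what came before – is the same.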

Natural language processing and machine learning 

Machine learning is an application of artificial intelligence (AI) that provides systems with the ability to automatically learn and improve from experience without being explicitly programmed.

The fact is – the words in a language are finite. English has approximately 170,000 words in current use. So how about programming computers to understand all of these words so that they give us the needed result? Sounds workable, right? Obviously not!

Although English has a finite number of words, these words can be combined in a vast number of ways to express ideas. Accents, slang, and errors present even more ambiguity, so explicitly programming a machine to understand language is impossible. This is where NLP comes into play.

NLP enables machines to understand human language (speech and text) with all its nuances.

This is what makes machine learning work so well here. With NLP, machines no longer need to be explicitly programmed to understand us. Instead, they learn our language patterns through experience and usage.

For instance, “she is a professor” in French is “elle est professeur.” So a machine learning algorithm predicts that “she is a dancer” will also begin with “elle est.” The computer then only needs to figure out the French word for “dancer,” which is “danseuse.”
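A toy version of that pattern-learning can be sketched with co-occurrence counts over a handful of parallel sentences. The `pairs` data and the `best_translation` helper are invented for illustration – real translation models are vastly more sophisticated than this:

```python
from collections import Counter, defaultdict

# A few parallel English-French pairs (toy data, not a real corpus).
pairs = [
    ("she is a professor", "elle est professeur"),
    ("she is happy", "elle est heureuse"),
    ("he is a professor", "il est professeur"),
]

# Count co-occurrences: which French words appear when an English word does?
cooc = defaultdict(Counter)
for en, fr in pairs:
    for e in en.split():
        for f in fr.split():
            cooc[e][f] += 1

def best_translation(word):
    """Pick the French word that co-occurs most often with the English word."""
    return cooc[word].most_common(1)[0][0] if word in cooc else "?"

# The model has seen "she is ..." begin with "elle est ...", so:
print(best_translation("she"), best_translation("is"))  # elle est
```

Even this crude counting recovers the "she is" → "elle est" pattern from the examples, which is the intuition behind learning from experience rather than explicit rules.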


Use case of NLP

Natural Language Processing is the mainspring behind various technological advancements. 

And today we’ll discuss the most interesting of them- 

How do personal assistant apps work?

So, we ask our Google Assistant for the latest updates, and it delivers. We ask Cortana which song is playing, and it answers. And the day goes by without us being amazed by the meticulous technology that makes it possible for a machine to understand what we’re saying and answer in our language.

Today, we’ll go behind the scenes and understand the basic dynamics.

For starters, a personal assistant understands your instructions in the language you speak. That is possible because personal assistant apps work through natural language processing.


How? Let’s see! 

  1. Your personal assistant app – say, Cortana or Google Assistant – first needs to translate your words into text. That’s usually done using a Hidden Markov Model (HMM).
  2. What is an HMM? It is a statistical model that translates what you have said into text that the NLP system can process.
  3. The HMM listens to 10- to 20-millisecond clips of your speech and looks for phonemes (the smallest units of speech) to compare with pre-recorded speech.
  4. Then comes the process of understanding the language and the context. The system tries to break each word down into its part of speech (noun, verb, etc.).
  5. To determine the context of your command, coded grammar rules and statistical algorithms are applied.
  6. The end result categorizes what was said in many different ways.

Note – the HMM method converts speech to text, and NLP enables human-to-computer communication. But it is semantic analysis that allows everything to make sense contextually.
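To make the HMM idea concrete, here is a toy Viterbi decoder that picks the most likely phoneme sequence for a series of acoustic clips. Every probability and label below is invented for illustration – real speech recognizers use enormous trained models, but the decoding principle is the same:

```python
# Hidden states are phonemes; observations are (noisy) acoustic clip labels.
states = ["HH", "EH", "L"]
start_p = {"HH": 0.8, "EH": 0.1, "L": 0.1}
trans_p = {"HH": {"HH": 0.1, "EH": 0.8, "L": 0.1},
           "EH": {"HH": 0.1, "EH": 0.1, "L": 0.8},
           "L":  {"HH": 0.1, "EH": 0.1, "L": 0.8}}
# How likely each phoneme is to produce a given acoustic label.
emit_p = {"HH": {"h": 0.7, "e": 0.2, "l": 0.1},
          "EH": {"h": 0.2, "e": 0.7, "l": 0.1},
          "L":  {"h": 0.1, "e": 0.2, "l": 0.7}}

def viterbi(observations):
    """Return the most likely phoneme sequence for the observed clips."""
    # best[t][s] = (probability of best path ending in s at time t, previous state)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        row = {}
        for s in states:
            prob, prev = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s][obs], p)
                for p in states
            )
            row[s] = (prob, prev)
        best.append(row)
    # Backtrack from the most probable final state.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for row in reversed(best[1:]):
        state = row[state][1]
        path.append(state)
    return path[::-1]

print(viterbi(["h", "e", "l"]))  # → ['HH', 'EH', 'L']
```

The decoder weighs how likely each phoneme is to follow the previous one against how well it matches each clip – exactly the trade-off the HMM step above is making at scale.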

Let’s understand this in-depth by taking the example of the most popular personal assistant apps – Alexa, Siri, Cortana, and Google Assistant.

How does Alexa work?

  1. The process starts with Alexa trying to improve the target signal so that ambient noise is identified and minimised. Acoustic echo cancellation makes sure only the important signal remains.
  2. The next step is wake word detection – the device detects whether the user has said one of the words it needs in order to turn on. Here, ‘Alexa’ is the wake word.
  3. Amazon then records the words in your command.
  4. This recording of your speech is sent to the cloud (Amazon’s servers) to be analysed.
  5. Amazon breaks your command down into individual sounds.
  6. It then consults a database of word pronunciations to determine which words best correspond to those sounds.
  7. Next, it identifies important words and carries out the corresponding actions.
  8. For example, if Alexa detects words like ‘pizza’ or ‘dessert’, it would open a food app.
  9. Finally, Amazon’s servers send the information back to your device and Alexa speaks.
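Steps 7-8 – spotting important words and mapping them to actions – can be sketched as simple keyword routing. The `ACTIONS` table and action names here are invented; Alexa’s real skill routing relies on trained models, not a lookup table:

```python
# Hypothetical keyword-to-action table (illustrative only).
ACTIONS = {
    "open_food_app": {"pizza", "dessert", "burger"},
    "play_music": {"song", "play", "music"},
    "weather_report": {"weather", "forecast"},
}

def route(utterance):
    """Return the first action whose keywords appear in the utterance."""
    words = set(utterance.lower().split())
    for action, keywords in ACTIONS.items():
        if words & keywords:
            return action
    return "fallback"

print(route("order me a pizza"))     # → open_food_app
print(route("what is the weather"))  # → weather_report
```

The fallback branch matters in practice: when no keywords match, a real assistant asks a clarifying question rather than guessing.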

How does Siri work?

Siri works on the following technologies –

  1. Speech recognition
  2. Natural language processing

Let’s see how it works- 

  1. When you order Siri to perform an action, it records the frequencies and sound waves of your voice and converts them into a code.
  2. Siri then breaks the code down in context to identify patterns and keywords.
  3. This data is fed into an algorithm and matched against thousands of combinations of sentences to identify what your phrase means.
  4. The algorithm then determines the context of your sentence by working through various literary expressions.
  5. Once Siri understands your request, it begins to assess the tasks that need to be performed.
  6. In this step, Siri also determines whether the required information is available within the phone’s data bank or has to be fetched from the web.
  7. Finally, Siri answers your question in complete, cohesive sentences.
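Steps 2-4 – matching the transcribed phrase against known sentence patterns – can be sketched with a small pattern bank. The patterns, intent names, and `interpret` helper are invented for illustration; Siri’s actual matching is statistical, not a handful of regular expressions:

```python
import re

# Hypothetical sentence patterns mapped to intents (illustrative only).
PATTERNS = [
    (re.compile(r"\bwhat(?:'s| is) the weather\b"), "get_weather"),
    (re.compile(r"\bset (?:an? )?alarm for (?P<time>[\w: ]+)"), "set_alarm"),
    (re.compile(r"\bcall (?P<name>\w+)"), "make_call"),
]

def interpret(phrase):
    """Return (intent, extracted slots) for the first matching pattern."""
    for pattern, intent in PATTERNS:
        m = pattern.search(phrase.lower())
        if m:
            return intent, m.groupdict()
    return "unknown", {}

print(interpret("Hey, call Alice"))        # → ('make_call', {'name': 'alice'})
print(interpret("Set an alarm for 7 am"))  # → ('set_alarm', {'time': '7 am'})
```

Note how the pattern both identifies what the phrase means (the intent) and pulls out the pieces needed to act on it (the name, the time) – the same two jobs the real system performs.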

How does Google Assistant work?

Google Assistant mainly works on three mechanisms –

  1. Entity Extraction – First, entities like dates, times, places, names, and phone numbers are extracted, and the assigned actions are performed on them – making a call, for example. These entities are extracted using ML algorithms or lexical analysis.
  2. Intent Recognition – Next, the intent of your request is recognized by a machine learning model. It is the intent that drives the dialogue and determines the action.
  3. Dialogue Generation – Finally, a response is generated based on the input and the intent of the query.

Note: Each personal assistant app has its own dialogue model.
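The three mechanisms above can be sketched end to end in a few lines. The regexes, intent keywords, and response templates below are invented for illustration – Google Assistant uses trained ML models, not hand-written rules:

```python
import re

# Hypothetical entity patterns and intent keywords (illustrative only).
ENTITY_PATTERNS = {
    "phone": re.compile(r"\b\d{3}-\d{4}\b"),
    "time": re.compile(r"\b\d{1,2}(?::\d{2})?\s?(?:am|pm)\b"),
}
INTENT_KEYWORDS = {
    "make_call": {"call", "dial"},
    "set_reminder": {"remind", "reminder"},
}

def handle(query):
    """Entity extraction -> intent recognition -> dialogue generation."""
    q = query.lower()
    # 1. Entity extraction: pull out structured values.
    entities = {name: pat.findall(q) for name, pat in ENTITY_PATTERNS.items()}
    # 2. Intent recognition: decide what the user wants.
    words = set(q.split())
    intent = next((i for i, kw in INTENT_KEYWORDS.items() if words & kw), "unknown")
    # 3. Dialogue generation: build a response from intent plus entities.
    if intent == "make_call" and entities["phone"]:
        return f"Calling {entities['phone'][0]}"
    if intent == "set_reminder" and entities["time"]:
        return f"Reminder set for {entities['time'][0]}"
    return "Sorry, I didn't get that."

print(handle("Call 555-1234 please"))  # → Calling 555-1234
print(handle("Remind me at 7 pm"))     # → Reminder set for 7 pm
```

The separation matters: entities without an intent (or vice versa) are not actionable, which is why the assistant needs both before it can generate a useful response.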

How does Cortana work?

Cortana is activated by your voice or text (the questions you type).

Its intelligence comes via Bing. Bing has access to the following – 

  1. Tellme’s (parent company: Microsoft Corporation) natural language processing
  2. Satori knowledge repository
  3. Microsoft’s cloud processing power

Cortana learns about you and your patterns, and stores the highlights in its Notebook. Your request is assessed and analyzed through Cortana’s intelligence, and the suitable action or answer is delivered to you.

Conclusion

Data science is one of the most fascinating fields in the world today. The stream is fluid, ever-evolving, and full of opportunities. And the right time to leverage it is – now!

There’s no better way to learn than diving right in. Read books, Google all you want, listen to podcasts, and explore the subject. Then, to become a professional, prove your expertise by earning an e-learning certificate.

Related Read: Data Scientist Job Description

Since you’re here…
Curious about a career in data science? Experiment with our free data science learning path, or join our Data Science Bootcamp, where you’ll get your tuition back if you don’t land a job after graduating. We’re confident because our courses work – check out our student success stories to get inspired.

About Sakshi Gupta

Sakshi is a Managing Editor at Springboard. She is a technology enthusiast who loves to read and write about emerging tech. She is a content marketer with experience in the Indian and US markets.