
Speech Technologies in eLearning: Where to Use Them and How?

The future of eLearning seems to favor personalized and unique experiences, automation, and data-based solutions. Success in this domain depends on effectiveness and on technologies like speech recognition, which enable an electronic device to understand and analyze the spoken word.


In this article, we look at how companies use speech technologies in eLearning and highlight the pros and cons of applying these tools now. Let's jump right into it.

Language Learning

Perhaps the greatest obstacle for language learners is finding someone to speak with. Fortunately, technology offers a solution to this problem. Automatic speech recognition (ASR) is one of the most important tools for effective language learning.

Services like Amazon Polly, for example, turn text into realistic speech, letting applications talk to users. In other words, it is a Text-to-Speech (TTS) service that uses deep learning to synthesize humanlike speech. Duolingo uses this service to teach languages through its app.

The next step is to make sure you pronounce words and phrases accurately. If you're into language apps, you have probably heard of Rosetta Stone. Their program aims to help you perfect your accent by rehearsing everyday words and phrases and reading short stories out loud. It also compares your speech to that of a native speaker to give you an immediate assessment of which words you pronounced correctly and which could use more effort.


You can even compare the waveform of your voice to that of a native speaker for precise adjustments. Rosetta Stone also tracks your progress and shows you how your pronunciation has improved over time. The app supports more than 20 languages, including Chinese, Japanese, Korean, Portuguese and Arabic.
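The word-by-word assessment described above can be illustrated with a minimal sketch. It assumes an ASR engine has already turned the learner's recording into text, and it compares that transcript with the target phrase. The function name `assess_pronunciation` and the scoring rule are hypothetical, and this is not how Rosetta Stone works internally; commercial products compare speech at the acoustic level rather than at the text level.

```python
from difflib import SequenceMatcher


def assess_pronunciation(target: str, recognized: str) -> dict:
    """Compare the phrase the learner was asked to say with what the recognizer heard.

    Assumption: both strings are plain words separated by spaces, without punctuation.
    """
    target_words = target.lower().split()
    heard_words = recognized.lower().split()

    # Align the two word sequences; autojunk is disabled so short phrases align predictably.
    matcher = SequenceMatcher(a=target_words, b=heard_words, autojunk=False)

    correct, needs_work = [], []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag == "equal":
            correct.extend(target_words[i1:i2])
        elif tag in ("replace", "delete"):
            # Target words that were misrecognized or not heard at all.
            needs_work.extend(target_words[i1:i2])
        # "insert" means extra words were heard; they do not belong to the target.

    score = len(correct) / len(target_words) if target_words else 0.0
    return {"correct": correct, "needs_work": needs_work, "score": score}


result = assess_pronunciation("the weather is nice today", "the feather is nice today")
print(result["needs_work"], result["score"])
```

In this example the recognizer heard "feather" instead of "weather", so that single word is flagged for more practice while the other four count as correct.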

Teaching Children

Reading is one of the most essential skills a person can acquire. But there are concerns that the mostly illustrated nature of children's mobile apps could interfere with the early development of that ability. There is, of course, no lack of reading apps for young children, but Area 120 offers something special with Google Rivet.

The program applies speech recognition and AI to help kids learn to read. Children often read by speaking out loud, and Rivet uses that to analyze their voice and help them pronounce certain words.

Rivet has numerous features that serve this goal. You might want to turn off the “Follow Along” mode, for example, so that kids don't let the app just read for them. Instead, they can tap on a word for reading assistance or let Rivet rate how well they pronounce certain words.

Rivet isn't the first reading app from Google. A product called Bolo offers similar features, but it is aimed at children in India.

“Struggling readers,” Ben Turtel, Rivet's head of Tech and Product, told TechCrunch, “are unlikely to catch up and four times less likely to graduate from high school. Unfortunately, 64% of fourth-grade students in the United States perform below the proficient level in reading.”

Soft Skills & Customer Service Training

Language learning and teaching kids to read are not the only domains where speech technologies are used. Customer support and sales departments use ASR and AI to train employees in soft skills. This method makes learning more practical and responsive: the trainee has to behave and speak with the program as they would with a real person.

A startup called VirtualSpeech offers this kind of eLearning experience by combining VR and speech analysis. You can practice your speech or presentation on a virtual stage in front of a virtual crowd that imitates the manners and sounds of real people. At the end of your virtual speech, the app analyzes and scores your verbal and nonverbal communication.


VirtualSpeech applies linguistic analytics and personality theory to show how you are being perceived by the audience. Also, you will receive feedback on eye contact, volume and pace of your speech.
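Feedback on pace, one of the metrics mentioned above, is simple to approximate once a transcript and the recording length are available. The sketch below is an illustration only: the function names and the 120 to 160 words-per-minute comfort range are assumptions made for this example, not values taken from VirtualSpeech.

```python
def speaking_pace(transcript: str, duration_seconds: float) -> float:
    """Return the speaking pace in words per minute."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return len(transcript.split()) * 60 / duration_seconds


def pace_feedback(wpm: float, low: float = 120.0, high: float = 160.0) -> str:
    """Turn a pace value into a short hint. The thresholds are illustrative assumptions."""
    if wpm < low:
        return "too slow"
    if wpm > high:
        return "too fast"
    return "comfortable"


# A 200-word speech delivered in one minute.
wpm = speaking_pace(" ".join(["word"] * 200), 60)
print(wpm, pace_feedback(wpm))
```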

Many companies are already using speech recognition systems to great effect. For example, we at Belitsoft have developed a comprehensive speech recognition suite for a British bank.

The software included a call-center speech recognition (SR) system that could spot keywords in clients' requests and route the calls accordingly. Clients could use the voice biometrics security system to create a voiceprint and use it as their password. Employees used the system to log in to their work computers or access restricted information.
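To make the keyword-spotting idea concrete, here is a minimal sketch that routes a request once it has been transcribed. The department names, the keyword sets and the `route_request` function are hypothetical and serve only as an illustration; they do not describe the system built for the bank.

```python
import re

# Hypothetical routing table: department -> keywords that suggest it.
ROUTES = {
    "cards": {"card", "pin", "stolen"},
    "loans": {"loan", "mortgage", "interest"},
    "accounts": {"balance", "transfer", "statement"},
}


def route_request(transcript: str, default: str = "operator") -> str:
    """Pick the department whose keywords occur most often in the transcript."""
    words = re.findall(r"[a-z']+", transcript.lower())
    hits = {dept: sum(word in keywords for word in words)
            for dept, keywords in ROUTES.items()}
    best = max(hits, key=hits.get)
    # Fall back to a human operator when no keyword was spotted.
    return best if hits[best] > 0 else default


print(route_request("I lost my card and need a new PIN"))
print(route_request("Hello, can I talk to someone?"))
```

A production system would spot keywords in the audio stream itself and handle synonyms and recognition errors, but the routing decision follows the same pattern.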

Speech analytics was able to discern the emotions of the caller and the operator and estimate adherence to the procedures required by the bank. It could also determine the caller’s sex, approximate age, and even temperament.

These cases show that voice analytics systems offer enough tools for use in employee training. So let's recap the main speech technologies and systems that can be implemented in eLearning today:

  • ASR can be used to operate the training system;
  • Voice biometrics can be applied for the ongoing verification of the student who is being tested;
  • TTS is well suited to voicing written lectures and long theoretical materials;
  • Speech analysis can help correct the pronunciation of words and phrases in foreign languages;
  • Emotion analysis of the voice fits training in conflict management and professional dialogue with clients.
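The voice biometrics item in the list above can be sketched as a similarity check between two voiceprints. The example assumes that some speaker-embedding model (not shown here) has already converted the enrollment recording and a fresh sample into numeric vectors. The three-dimensional vectors and the 0.8 threshold are made-up values for illustration.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors of the same length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def is_same_speaker(enrolled: list[float], sample: list[float],
                    threshold: float = 0.8) -> bool:
    """Accept the sample if it is close enough to the enrolled voiceprint.

    The threshold is an illustrative assumption; real systems tune it on data.
    """
    return cosine_similarity(enrolled, sample) >= threshold


enrolled_voiceprint = [0.90, 0.10, 0.40]   # hypothetical embedding from enrollment
during_test = [0.85, 0.15, 0.38]           # hypothetical sample from the same student
someone_else = [0.10, 0.90, 0.20]          # hypothetical sample from another person

print(is_same_speaker(enrolled_voiceprint, during_test))
print(is_same_speaker(enrolled_voiceprint, someone_else))
```

Repeating this check at intervals during a test is one way the ongoing verification of a student could work.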

Pros & Cons of Using Speech Technologies in eLearning

Pros

  • Cost-effectiveness. Speech-based eLearning systems can be cheaper than live teachers in the long run.
  • Real-life simulations. Speech technologies can analyze clients' arguments, provide opportunities for language practice, and offer suggestions. Both companies and students can benefit from role-playing with a program to prepare for real-life situations.
  • Personalization. Communication with AI can come close to direct interaction between student and teacher. At the same time, you can speak more comfortably than you would in a classroom with other students.

Cons

  • Problems with speech recognition itself. Speech technology still needs a lot of polishing. Probably the major barrier for SR is the large variation in how people pronounce words. Not all languages are supported as well as English. And in certain cases, hardware requirements make the technology difficult to implement.
  • Limited applications. The range of speech-based eLearning products is still small. There are already solutions that may fit your needs, but the market offers little choice right now.


A MarketsandMarkets forecast shows that the speech and voice recognition market is expected to reach USD 21.5 billion by 2024, up from USD 7.5 billion in 2018. Global tech giants like Google, Amazon, Apple and Microsoft are investing countless work hours in developing speech-related tools like Alexa and Google Assistant. Smaller companies and startups are also constantly striving to advance speech technology. Some of them have found new ways in which this technology can make eLearning more beneficial and easier for everyone.

From teaching children to read and speak to language learning, and from improving customer service to training soft skills, the use of ASR, TTS and AI with machine learning algorithms already gives a more personalized learning experience. It's easy to imagine that with further improvements these technologies will reach even deeper into the eLearning industry in the future.
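To put the forecast figures into perspective, the implied compound annual growth rate can be worked out from the two values quoted above. This is a back-of-the-envelope sketch: it assumes six full years of growth between 2018 and 2024, and the `cagr` helper is written for this example.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values over a number of years."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted in the forecast, in USD billion.
growth = cagr(start_value=7.5, end_value=21.5, years=2024 - 2018)
print(f"Implied annual growth: {growth:.1%}")
```

The result is a little over 19% per year, which means the market is expected to nearly triple over the period.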


Written by
CTO / Department Head / Partner
I've been leading a department specializing in custom eLearning software development and Business Intelligence software development for 17 years.