Voiceovers in eLearning are more efficient when you use text-to-speech


Is it time to retire the humble microphone?

Instructional designers and eLearning developers routinely add voiceovers to the eLearning that they develop, and toolsets have evolved over the years to suit this behaviour. But maybe it’s time to re-examine this habit and take a look at text-to-speech instead?

Speaking a Script

One of the most common ways to add audio to an eLearning module is to write a simple script, hook a microphone up to some sort of recording device (normally your PC) and use something like Audacity (which is an excellent tool, by the way) to record your voiceover and perform basic audio editing. Once the audio is exported (probably as an .mp3), you import it into your eLearning authoring tool of choice and synchronise it with your content. Some eLearning authoring tools have built-in audio recording features, but this is still a generally inefficient way to create an audio track. Here’s why:

  1. You have to think about what you are going to say and record it without too many “umms” and “errs”, hesitations, slips of the tongue or background noise.
  2. If you need to edit the content at a later date then you will probably need to re-record the voiceover, which will involve getting everything set up again – and hopefully the voiceover artist who performed the recording is still working for you.
  3. If (or rather when) you have the need to translate the eLearning into a different language, then you will need to find a voiceover artist who speaks that local language.

There are also numerous debates about whether the narration should repeat the on-screen text or say something completely different. Personally, we are in favour of saying something different if you are going to say something at all, as speaking is generally slower than reading. (We hate having to wait for people to finish speaking something that we’ve already read on the screen.)


Text-to-speech has been coming on in leaps and bounds in recent years, with companies like Acapela and ReadSpeaker providing enhanced voice synthesis and many different voices to choose from. We hear these voices all around us in everyday life, in situations such as online banking, healthcare and transportation, so why do we resist them when it comes to eLearning and training content? Well, part of the reason is that in the past they have been pretty bad, and the thinking was that this distracted the learner from actually learning. Nowadays though, even the standard Microsoft voices have improved as Microsoft seeks to help people with dyslexia. So, maybe it’s time to look at text-to-speech again as a voiceover technique in eLearning?

Efficiency Gains from Text-to-Speech in eLearning

Here at Knowledge Tek, we conducted a fairly simple trial to see how much time we could save by creating a simple piece of eLearning using a free voice from Microsoft.

Firstly, we created a small piece of eLearning in TT-Knowledge Force, using text-to-speech for the voiceover. Please see this video:

https://youtu.be/a9scFFTtRdc or https://vimeo.com/328605296

We then measured how long it took us to produce the voiceover. In the video this runs from time stamp 1:24 through to 1:56, so it took 32 seconds at a leisurely pace.

Next, we used Audacity to record the same text and produce an .mp3 for each file. After one small ‘speaking’ slip-up along the way, this took us 8 minutes and 3 seconds (483 seconds). So, without factoring in the time needed to import the .mp3 files into an authoring tool, TT-Knowledge Force was already:

483 / 32 ≈ 15.1 times faster. Put another way, the recorded voiceover took about 1,509% of the time that text-to-speech did.
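For anyone who wants to check the arithmetic, the comparison can be worked through in a few lines of Python. This is only a minimal sketch and was not part of our trial: the helper name `to_seconds` and the variable names are made up for illustration, and the only inputs are the time stamps and the recording time quoted above.

```python
def to_seconds(stamp: str) -> int:
    """Convert an 'm:ss' time stamp (hypothetical helper) to whole seconds."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + int(seconds)


# Text-to-speech voiceover: from 1:24 to 1:56 in the video.
tts_seconds = to_seconds("1:56") - to_seconds("1:24")

# Recorded voiceover in Audacity: 8 minutes and 3 seconds.
recorded_seconds = to_seconds("8:03")

# How many times longer the recording took than text-to-speech.
ratio = recorded_seconds / tts_seconds

# The same ratio expressed as a percentage of the text-to-speech time.
percent_of_tts_time = ratio * 100

print(f"Text-to-speech: {tts_seconds} seconds")
print(f"Recorded voiceover: {recorded_seconds} seconds")
print(f"Recording took {ratio:.1f} times as long")
print(f"That is {percent_of_tts_time:.0f}% of the text-to-speech time")
```

Dividing the two durations gives the speed-up factor, and multiplying that factor by 100 expresses the recording time as a percentage of the text-to-speech time.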


Here at Knowledge Tek, we have now confirmed our long-held belief that traditional voiceovers take a considerable length of time to produce. We believe that when a large amount of eLearning material needs to be created, modern voice synthesis is a way forward. There will always be times when a synthetic voice just won’t do, but if you could save vast amounts of time and a synthetic voice is acceptable, you’d be crackers not to use text-to-speech.

If you’d like to find out more or arrange a demonstration, please contact us.

