What Is Object Recognition and Why Is It Important?

Microscope examining a slide

Basic Principles of Object Recognition

Object recognition is used in certain software simulations/screen recordings to automatically generate end-user instructions for learners on how to perform a process.

Object recognition has its roots in accessibility standards that allow screen readers and other assistive technologies to help visually impaired people use software. An application exposes the names of its input fields, drop-down lists, buttons and other objects to a recording tool. The tool can then combine those names with the recorded actions to form grammatical sentences that tell learners how to repeat a process.
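To make the idea concrete, here is a minimal sketch in Python of how exposed object names might be turned into instructions. It is an illustration only, not how any particular recording tool works: the `UiEvent` structure, its field names, and the `describe` and `build_steps` helpers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class UiEvent:
    """One captured user action (hypothetical structure, for illustration)."""
    action: str        # "click" or "type"
    object_type: str   # e.g. "input field", "button", "menu option"
    object_name: str   # the name the application exposes for the object
    value: str = ""    # the text entered, used only for "type" actions


def describe(event: UiEvent) -> str:
    """Build one instruction sentence from the object's exposed name and type."""
    if event.action == "click":
        return f'Click on the "{event.object_name}" {event.object_type}'
    if event.action == "type":
        return f'Type "{event.value}" into the "{event.object_name}" {event.object_type}'
    raise ValueError(f"Unsupported action: {event.action}")


def build_steps(events):
    """Number the generated sentences so they read as a procedure."""
    return [f"{i}. {describe(e)}" for i, e in enumerate(events, start=1)]


# A made-up recording of three user actions.
recording = [
    UiEvent("click", "input field", "Username"),
    UiEvent("type", "input field", "Username", "jsmith"),
    UiEvent("click", "menu option", "File"),
]

for line in build_steps(recording):
    print(line)
```

The point of the sketch is that the wording comes entirely from the object names and the recorded actions, so nobody has to type the instructions by hand.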

Instructional Designers and Technical Writers typically spend a vast amount of time preparing and writing training manuals (see this guide on how to do it as one example), and many of the instructions they write end up taking this form:

  • Go to page/screen abc
  • Click on the input field at the top right hand side of the screen for “Username”
  • Type in your username
  • Click on the “Password” field that is underneath the username field
  • Type in your password
  • Click on the “File” menu option
  • etc.

A sufficiently motivated writer will often go to some lengths to include screenshots of the application in the instructions.

Wouldn’t it be great if these processes could be captured automatically? They can. More to the point, the actions performed (“Click on the abc…” or “Type xyz into the input field”) can also be captured automatically, along with screenshots and on-screen highlights of each action.

Here is an example:

A document showing the process steps for how to insert a table in Microsoft Word
This is the resulting set of instructions, created from object-recognition-based screen capture

A screenshot of a document isn’t groundbreaking until you consider how the document was created: the whole process took about 60 seconds.

Object Recognition in Action

Please watch this video to see object recognition in action. It’s only 30 seconds long.

YouTube Video
Click to view on YouTube

TIP: It’s best to watch this in full-screen mode and full HD on a desktop or laptop. Notice how the actions are captured and the words are generated automatically.



After the object recognition has been performed, it takes only two clicks to generate the training manual. In the example above, we haven’t branded the manual using any templates, but branding can also be added very easily.

Multiple Output Types

The auto-generated words that come out of object recognition are not limited to training manuals. They can also be used in eLearning, quick reference guides and PowerPoint presentations.

Picture of an eLearning tutorial control bar with auto-generated words derived from object recognition
eLearning Tutorial Control


So, object recognition can take the pain out of creating training manuals. In other words, it makes Instructional Designers and Technical Writers more efficient.

If you’d like to find out more about how object recognition can improve your efficiency, then please contact us to arrange a demonstration.


