This article explains the differences between the well-known “Audio Description” (AD) and “Visual Description” as we envision and use it.
Audio description
Audio description, also referred to as a video description, described video, or more precisely called a visual description, is an additional narration track intended primarily for blind and visually impaired consumers of visual media (including television and film, dance, opera, and visual art). It consists of a narrator talking through the presentation, describing what is happening on the screen or stage during the natural pauses in the audio, and sometimes during dialogue if deemed necessary.
– Wikipedia
The Visual Description model
We would like to approach audio description for blind people from another perspective. We want to give more detailed information about what is happening during each scene, and to provide meta information that can be listened to before or after the video. This meta information consists of character and location descriptions.
A visual description has a specific structure: characters, locations, and scenes.
First, we need to describe the characters who can be seen in the video. Each character has its own description. Character descriptions are ordered from the main (most seen) character down to the least involved one. Key information when describing a character:
- Gender
- Age
- Race
- Look (body type, hair, eyes, and any distinctive aspect of how the person looks)
- Name (if known) – if not known, we need to assign an arbitrary name to identify the character
We also need to describe the locations:
- Indoor / Outdoor
- What can be seen in general at this location
And finally, the scenes:
- Which location we are in (from the location list)
- Which characters are in this scene
- What is happening in the scene
- How the characters interact: what their mood is, and whether they physically interact
- When creating scenes, it is important to set the start timestamp of each scene and to make sure the scene descriptions cover the entire video playback
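The structure above can be sketched as a small data model. This is only an illustration, not our actual implementation: the class and field names (Character, Location, Scene, VisualDescription, problems) are hypothetical, and the sketch assumes that each scene runs until the next one starts, so the video is fully covered as long as the first scene starts at 0 seconds.

```python
from dataclasses import dataclass, field


@dataclass
class Character:
    name: str    # real name, or an arbitrary assigned name if unknown
    gender: str
    age: str
    race: str
    look: str    # body type, hair, eyes, distinctive features


@dataclass
class Location:
    name: str
    indoor: bool
    overview: str  # what can be seen in general at this location


@dataclass
class Scene:
    start: float            # start timestamp in seconds
    location: str           # should match a Location.name
    characters: list[str]   # should match Character.name values
    action: str             # what is happening in the scene
    interaction: str = ""   # mood and physical interaction, if any


@dataclass
class VisualDescription:
    characters: list[Character] = field(default_factory=list)  # most seen first
    locations: list[Location] = field(default_factory=list)
    scenes: list[Scene] = field(default_factory=list)

    def problems(self) -> list[str]:
        """Return readable problems; an empty list means none were found."""
        issues = []
        known_characters = {c.name for c in self.characters}
        known_locations = {loc.name for loc in self.locations}
        ordered = sorted(self.scenes, key=lambda s: s.start)
        # Assumption: a scene lasts until the next scene starts.
        if not ordered or ordered[0].start > 0:
            issues.append("no scene starts at 0 seconds")
        for scene in ordered:
            if scene.location not in known_locations:
                issues.append(
                    f"scene at {scene.start}s uses unknown location {scene.location!r}"
                )
            for name in scene.characters:
                if name not in known_characters:
                    issues.append(
                        f"scene at {scene.start}s lists unknown character {name!r}"
                    )
        return issues


# Made-up example data: the second scene refers to a location and a
# character that were never described, so two problems are reported.
demo = VisualDescription(
    characters=[Character("Anna", "female", "around 30", "white", "tall, short dark hair")],
    locations=[Location("Kitchen", True, "a small, bright kitchen with a wooden table")],
    scenes=[
        Scene(0.0, "Kitchen", ["Anna"], "Anna makes coffee"),
        Scene(42.5, "Garden", ["Anna", "Ben"], "Anna and Ben talk", "cheerful; they hug"),
    ],
)
for problem in demo.problems():
    print(problem)
```

Storing only names in each scene, rather than full descriptions, mirrors the idea that characters and locations are described once and then referenced from the scene list.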
Blind people have several playback options. They can listen to all metadata and narration before or after viewing the entire video, either automatically or one item at a time. It is also possible to have the video pause automatically before each scene, listen to the description, and then move on to the scene.
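These playback options can be thought of as different orderings of the same material. The following is a minimal sketch under our own assumptions, not a description of a real player: the names Mode and playback_plan are hypothetical, a "speak" step stands for reading one description aloud, and a "play" step stands for playing the video between two timestamps (None meaning "until the end"). In this sketch the per-scene mode does not include the character and location metadata; that is a simplification.

```python
from enum import Enum


class Mode(Enum):
    BEFORE = "before"        # all descriptions first, then the video
    AFTER = "after"          # the video first, then all descriptions
    PER_SCENE = "per_scene"  # pause before each scene and describe it


def playback_plan(mode, metadata, scenes):
    """Return an ordered list of (action, payload) steps.

    metadata: character and location descriptions, as strings
    scenes:   (start_seconds, description) pairs sorted by start time
    """
    whole_video = ("play", (0.0, None))
    narration = [("speak", text) for text in metadata]
    narration += [("speak", text) for _, text in scenes]

    if mode is Mode.BEFORE:
        return narration + [whole_video]
    if mode is Mode.AFTER:
        return [whole_video] + narration

    # PER_SCENE: describe a scene, then play it up to the next scene's start.
    steps = []
    for i, (start, text) in enumerate(scenes):
        end = scenes[i + 1][0] if i + 1 < len(scenes) else None
        steps.append(("speak", text))
        steps.append(("play", (start, end)))
    return steps


plan = playback_plan(
    Mode.PER_SCENE,
    ["Anna: around 30, short dark hair"],
    [(0.0, "Anna makes coffee"), (42.5, "Anna walks into the garden")],
)
for step in plan:
    print(step)
```

A player could run through such a list automatically, or wait for the listener to confirm each step, which corresponds to the "automatically or one at a time" choice.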