Sonification is the use of non-speech audio, varying parameters such as pitch, intensity, and tempo, to represent patterns and trends in data. Sonification becomes increasingly relevant when the visual channel is overloaded, fatigued, or impractical for communicating information (e.g., combat scenarios or large-dataset exploration), because audio offers advantages over visualization: it is immersive and not limited by display size. The purpose of this study was to create a theory of successful sonification design based on the target audience, focusing on the key parameters that drive comprehension of, and reaction time to, sonifications that serve as a reasonable alternative to presenting the same information visually.
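As a minimal sketch of the pitch parameter described above, a data series can be sonified by mapping each value linearly onto an audible frequency range; the function and the chosen frequency bounds below are illustrative assumptions, not taken from the study.

```python
def value_to_pitch(value, vmin, vmax, fmin=220.0, fmax=880.0):
    """Linearly map a data value onto a frequency range in Hz.

    fmin/fmax default to one octave below and above A440,
    an arbitrary but comfortably audible range.
    """
    if vmax == vmin:
        return fmin  # degenerate case: a flat series maps to one tone
    frac = (value - vmin) / (vmax - vmin)
    return fmin + frac * (fmax - fmin)

# Example: higher data values become higher-pitched tones.
data = [3, 7, 2, 9, 5]
pitches = [value_to_pitch(v, min(data), max(data)) for v in data]
print([round(p, 1) for p in pitches])
```

Intensity and tempo could be mapped the same way, each carrying an additional data dimension in the same audio stream.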
Two experiments were conducted, testing listener comprehension and listener reaction time. The comprehension test asked subjects to match patterns of data presented through sonification to their visual counterparts. Findings show that untrained subjects can accurately match a sonification to the corresponding visualization more than 60% of the time, but cannot map the temporal location of a pattern presented through audio to its spatial location in a corresponding visual representation. The reaction time test assessed how quickly subjects recognized and reacted to a pattern of tones embedded in a background audio stream. Findings show that accurate recognition of, and reaction time to, a 5-tone pattern is noticeably better than for a 1- or 9-tone pattern. The researchers also found that the larger the temporal gap between the relevant patterns inserted into the background noise, the more likely subjects were to recognize and react to the pattern.
This study begins to lay the foundation for successfully sonifying data that is currently presented in visual form, and holds the potential to drive breakthrough changes in human-computer interfaces similar in impact to the move from character-based screens to graphical user interfaces.