Most people experience anxiety when speaking in public. When a presenter is anxious, the body reacts subconsciously, producing specific nonverbal expressions in the voice and body that the audience can perceive. Because these expressions matter so much, presenters must be trained to use their bodies and voices effectively as tools for delivering information. Such training is difficult, since nonverbal expressions are mostly given subconsciously. Practicing presentation skills alone, even with a video recording or in front of a mirror, is time-consuming, and people cannot assess their performance from the audience's point of view. Therefore, we have developed an automatic system that can assist presenters by giving advice based on their nonverbal expressions.
Nonverbal expression has been studied for a long time, and it is regarded as a significant channel for communication [1]. Over the last decade, the role of nonverbal communication in computing has strengthened thanks to the development of affective computing [2] and social signal processing [3]. One of the most noticeable situations where nonverbal expression finds its significance is public speaking. Indeed, public speaking is not only about what, but also how, information is delivered. In most cases, the audience is more convinced by the speaker's nonverbal expressions than the speech content itself [4]. Consequently, public speaking training has focused on the use of nonverbal expressions, such as body posture, gestures, and eye contact.
Our approach to public speaking training is based on the detection and analysis of speakers' nonverbal expressions (see Figure 1). The system takes its input from a range camera, such as Microsoft's Kinect, that captures both the presenter's body motion and voice. The nonverbal signal detection module, at the heart of the system, analyzes both body motion and voice to extract good and bad expressions, such as the appropriate use of gestures. Finally, the performance assessment module analyzes the detected expressions and delivers a final presentation score on a pre-defined scale.
Figure 1. Feedback system overview. A range camera captures data, which is then processed in the signal detection module for nonverbal expression recognition. The output is given to speakers as immediate feedback. The performance assessment module processes detected expressions for the whole presentation to provide a final assessment for presenters as off-line feedback.
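As a rough illustration of the performance assessment step, the sketch below turns counts of detected expressions into a score on a fixed scale. The scoring rule (a baseline plus weighted counts, clamped to the scale), the function name `presentation_score`, the expression labels, and the weights are all assumptions made for this example; the article does not specify how the module computes its score.

```python
def presentation_score(counts, weights, baseline=5.0, lo=1.0, hi=10.0):
    """Map counts of detected expressions to a score on a fixed scale.

    counts:  expression label -> number of times it was detected
    weights: expression label -> hypothetical contribution per occurrence
             (positive for good expressions, negative for bad ones)
    """
    raw = baseline + sum(weights.get(label, 0.0) * n for label, n in counts.items())
    # Clamp so the result always stays on the pre-defined scale.
    return max(lo, min(hi, raw))


# Hypothetical labels and weights, for illustration only.
example_weights = {"open gesture": 0.5, "crossed arms": -1.0}
example_counts = {"open gesture": 3, "crossed arms": 1}
example_score = presentation_score(example_counts, example_weights)
```

In practice, weights of this kind would have to come from data, for example from the expert ratings of recorded presentations, rather than being chosen by hand.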
The system's goal is to provide informative feedback while still letting people interact with it naturally. It therefore offers two forms of feedback: real-time and off-line. Real-time feedback is given during the presentation whenever a bad nonverbal expression occurs. This mechanism was built to help people become more aware of their voice and body. Off-line feedback is given only after the presentation and is aimed at those who want to present without interruption. Presenters can use it to improve their overall score for the entire presentation.
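The two feedback modes can be sketched as a small session object that records every detected expression and, in real-time mode only, returns an immediate alert for bad ones. This is a minimal sketch under stated assumptions: the names `Expression` and `FeedbackSession`, the alert wording, and the example label are hypothetical and do not come from the system itself.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Expression:
    time_s: float   # time of detection, in seconds from the start
    label: str      # hypothetical expression name
    is_good: bool   # whether the expression helps the presentation


@dataclass
class FeedbackSession:
    realtime: bool
    log: List[Expression] = field(default_factory=list)

    def on_expression(self, expr: Expression) -> Optional[str]:
        """Record every detection; alert immediately only in real-time mode."""
        self.log.append(expr)
        if self.realtime and not expr.is_good:
            return f"[{expr.time_s:.1f}s] watch out: {expr.label}"
        return None

    def offline_report(self) -> Dict[str, int]:
        """Count each detected expression once the presentation is over."""
        counts: Dict[str, int] = {}
        for expr in self.log:
            counts[expr.label] = counts.get(expr.label, 0) + 1
        return counts
```

Both modes keep the full log, so a presenter who receives real-time alerts can still review the off-line summary afterwards.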
The system required implicit research on nonverbal expressions in order to define a list of expressions that may improve public speaking performance. For that purpose, we collected a database of presentations. A regular camera and a Microsoft Kinect were set up in a public speaking class to record short trainee presentations. To date, we have collected 64 one-minute presentations (see Figure 2).
Figure 2. Examples from the database: color images (top four, from a regular camera) and depth images (bottom two, from Microsoft's Kinect).
A public speaking expert analyzed the database and flagged the nonverbal expressions of presenters who made a strong impact. The expert also assessed the performances on a scale of 1 to 10, and those evaluations were then digitized using Noldus Observer XT 10.5 commercial software, which also handled the video annotation and analysis of the contribution of each expression to the overall performance.
In parallel, we implemented an algorithm for human body analysis using the Microsoft Kinect as a capture device. The Kinect software development kit gave us the human skeleton representation as well as the 3D position and rotation of body joints. Presently, we are focusing on body posture and gesture. Each body posture is represented through the angles between connected body parts, such as the angle between the shoulders and upper arms. Such representation allows postures to be classified and mapped to our pre-defined list of expressions. In addition, we used Laban Movement Analysis, a method for describing, visualizing, interpreting, and documenting human movement, as an intermediate representation [5]. Body gestures were also automatically mapped to Laban's parameters [6].
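The angle-based posture representation can be illustrated with a few lines of Python. The sketch below computes the angle at a joint from three 3D joint positions and then assigns the posture to the closest entry in a small template list. It is a sketch under stated assumptions, not the system's algorithm: the joint names, the sample coordinates, the template labels, and the nearest-template rule are all hypothetical, and no Kinect SDK calls are used.

```python
import math


def joint_angle(a, b, c):
    """Angle in degrees at joint b, between segments b->a and b->c."""
    u = [a[i] - b[i] for i in range(3)]
    v = [c[i] - b[i] for i in range(3)]
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("joints coincide; angle is undefined")
    cos_theta = sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)
    # Clamp against rounding error before calling acos.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))


def posture_features(frame):
    """Represent a posture as (shoulder angle, elbow angle) of the right arm."""
    shoulder = joint_angle(frame["shoulder_center"], frame["shoulder_right"],
                           frame["elbow_right"])
    elbow = joint_angle(frame["shoulder_right"], frame["elbow_right"],
                        frame["wrist_right"])
    return (shoulder, elbow)


def classify(features, templates):
    """Return the label of the template closest to the feature vector."""
    return min(templates, key=lambda label: math.dist(features, templates[label]))


# Hypothetical skeleton frame: joint name -> (x, y, z) in metres.
frame = {
    "shoulder_center": (0.0, 1.4, 2.0),
    "shoulder_right": (0.2, 1.4, 2.0),
    "elbow_right": (0.2, 1.1, 2.0),
    "wrist_right": (0.2, 0.8, 2.0),
}
# Hypothetical templates: label -> (shoulder angle, elbow angle) in degrees.
templates = {"arm at side": (90.0, 180.0), "arm bent": (90.0, 90.0)}
label = classify(posture_features(frame), templates)
```

Because the features are angles rather than raw positions, they do not depend on where the presenter stands relative to the camera, which is one reason such a representation suits posture classification.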
In summary, we introduced our feedback system design for presenters who want to practice their nonverbal communication skills to improve their public speaking. We presented our progress in understanding the nonverbal expression of public speakers with the aim of building up a comprehensive list of expressions that affect a speaker's performance. Also, we presented the initial development of an algorithm for body analysis.
Our future research will primarily focus on automatic recognition of nonverbal expressions, including gesture, posture, facial expressions, eye contact, and voice. In parallel, we will implement a real-time feedback mechanism based on a simulated conference room that aims to give users a realistic learning experience.
This work was supported in part by the Erasmus Mundus Joint Doctorate in Interactive and Cognitive Environments (ICE), which is funded by the Education, Audiovisual and Culture Executive Agency of the European Commission under EMJD ICE FPA 2010-0012.
Anh-Tuan Nguyen, Wei Chen, Matthias Rauterberg
Eindhoven University of Technology
Eindhoven, The Netherlands
1. A. W. Siegman, S. Feldstein, Nonverbal Behavior and Communication, pp. 37-64, Psychology Press, 1987.
2. R. Picard, Affective Computing, MIT Press, 2000.
3. A. Pentland, Honest Signals: How They Shape Our World, MIT Press, 2010.
4. C. Reimold, P. Reimold, The Short Road to Great Presentations: How to Reach Any Audience Through Focused Preparation, Inspired Delivery, and Smart Use of Technology, Wiley-IEEE Press, 2003.
5. I. Bartenieff, Body Movement: Coping with the Environment, CRC Press, 1980.
6. A. T. Nguyen, W. Chen, M. Rauterberg, Online Feedback System for Public Speakers, IEEE Symp. e-Learning, e-Management and e-Services, 2012.