Journal of Zhejiang University

Frontiers of Information Technology & Electronic Engineering 2015 Vol.16 No.4 P.272-282

Using Kinect for real-time emotion recognition via facial expressions

Author(s): Qi-rong Mao, Xin-yu Pan, Yong-zhao Zhan, Xiang-jun Shen
Affiliation(s): 1. Department of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang 212013, China
Corresponding email(s): mao_qr@ujs.edu.cn, pxyz@vip.qq.com
Key Words: Kinect, Emotion recognition, Facial expression, Real-time classification, Fusion algorithm, Support vector machine (SVM)


Qi-rong Mao, Xin-yu Pan, Yong-zhao Zhan, Xiang-jun Shen. Using Kinect for real-time emotion recognition via facial expressions[J]. Frontiers of Information Technology & Electronic Engineering, 2015, 16(4): 272-282.

@article{Mao2015Kinect,
title="Using Kinect for real-time emotion recognition via facial expressions",
author="Qi-rong Mao and Xin-yu Pan and Yong-zhao Zhan and Xiang-jun Shen",
journal="Frontiers of Information Technology \& Electronic Engineering",
volume="16",
number="4",
pages="272-282",
year="2015",
publisher="Zhejiang University Press \& Springer",
doi="10.1631/FITEE.1400209"
}

%0 Journal Article
%T Using Kinect for real-time emotion recognition via facial expressions
%A Qi-rong Mao
%A Xin-yu Pan
%A Yong-zhao Zhan
%A Xiang-jun Shen
%J Frontiers of Information Technology & Electronic Engineering
%V 16
%N 4
%P 272-282
%@ 2095-9184
%D 2015
%I Zhejiang University Press & Springer
%DOI 10.1631/FITEE.1400209

TY - JOUR
T1 - Using Kinect for real-time emotion recognition via facial expressions
A1 - Qi-rong Mao
A1 - Xin-yu Pan
A1 - Yong-zhao Zhan
A1 - Xiang-jun Shen
JO - Frontiers of Information Technology & Electronic Engineering
VL - 16
IS - 4
SP - 272
EP - 282
SN - 2095-9184
Y1 - 2015
PB - Zhejiang University Press & Springer
DO - 10.1631/FITEE.1400209
ER -


Abstract: Emotion recognition via facial expressions (ERFE) has attracted a great deal of interest with recent advances in artificial intelligence and pattern recognition. Most studies are based on 2D images, and their methods are usually computationally expensive. In this paper, we propose a real-time emotion recognition approach based on both 2D and 3D facial expression features captured by Kinect sensors. To capture the deformation of the 3D mesh during facial expression, we combine the features of animation units (AUs) and feature point positions (FPPs) tracked by Kinect. A fusion algorithm based on improved emotional profiles (IEPs) and maximum confidence is proposed to recognize emotions with these real-time facial expression features. Experiments on both an emotion dataset and a real-time video show the superior performance of our method.

Reviewer comment: This paper proposes a method for facial expression recognition that combines 2D and 3D data captured by Kinect. The presented approach to FER (or ER) is rather obvious and straightforward; nevertheless, it is valuable and worth publishing.

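Real-time facial emotion recognition based on Kinect

Objective: To extract facial expression features with Kinect and recognize facial emotions in video in real time.
Innovation: We propose a novel Kinect-based facial emotion recognition method that takes full advantage of Kinect's speed and ease of use and makes a combined, real-time decision on the emotion expressed in the most recent consecutive frames of a video sequence. Because the method involves two different forms of expression features, we also propose a tailored fusion algorithm based on maximum confidence.
Methods: First, the Face Tracking SDK in Kinect is used to track the face in real-time video data and to extract facial animation unit information and feature point coordinates (Figs. 3 and 4). The two types of features are then processed in parallel. In each feature channel, the feature data are pre-recognized by a group of seven-class 1-vs-1 classifiers, and the pre-recognition results are stored in a buffer for emotion confidence statistics; the emotion with the highest confidence is the recognition result of that channel (Fig. 2). Finally, the results of the two feature channels are fused to give the final emotion recognition result (Fig. 1).
Conclusions: Based on two types of facial expression features extracted with Kinect (facial animation unit information and feature point coordinates), a novel and efficient facial emotion recognition method is proposed that recognizes facial emotions in video in real time.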
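
The per-channel confidence statistics and the final fusion step in the method summary can be illustrated with a small Python sketch. It is only an illustration under stated assumptions, not the authors' implementation: the names `ChannelBuffer` and `fuse_max_confidence`, the buffer length, the emotion labels, and the definition of confidence as the share of buffered frames that voted for a label are all hypothetical, since the summary does not specify how the improved emotional profiles are computed.

```python
from collections import Counter, deque


class ChannelBuffer:
    """Sliding window of per-frame pre-recognition labels for one feature channel.

    Hypothetical helper: the window length and the vote-share confidence
    are assumptions made for this sketch.
    """

    def __init__(self, window_size=30):
        # Only the most recent `window_size` frame labels are kept.
        self.labels = deque(maxlen=window_size)

    def push(self, label):
        """Store the pre-recognition label of the newest frame."""
        self.labels.append(label)

    def decide(self):
        """Return (label, confidence) for the most frequent buffered label."""
        if not self.labels:
            return None, 0.0
        label, count = Counter(self.labels).most_common(1)[0]
        return label, count / len(self.labels)


def fuse_max_confidence(au_result, fpp_result):
    """Keep the channel result with the higher confidence.

    Ties are resolved in favour of the AU channel (an arbitrary choice).
    """
    return au_result if au_result[1] >= fpp_result[1] else fpp_result


# Toy per-frame outputs for the AU and FPP channels (made-up labels).
au, fpp = ChannelBuffer(window_size=5), ChannelBuffer(window_size=5)
for label in ["happy", "happy", "neutral", "happy", "happy"]:
    au.push(label)
for label in ["surprise", "happy", "surprise", "neutral", "surprise"]:
    fpp.push(label)

final_label, final_conf = fuse_max_confidence(au.decide(), fpp.decide())
print(final_label, final_conf)  # prints: happy 0.8
```

In this toy run the AU channel agrees on one label in four of five frames, while the FPP channel agrees in only three of five, so the fused decision follows the AU channel. In a real pipeline the per-frame labels would come from the classifiers in each channel rather than from hard-coded lists.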

Key words: Kinect; Emotion recognition; Facial expression; Real-time classification; Fusion algorithm; Support vector machine


