Extraction of Speech Features for Emotion Recognition

감정 인식을 위한 음성 특징 도출

  • Received : 2012.02.27
  • Accepted : 2012.05.14
  • Published : 2012.06.30

Abstract

Emotion recognition is an important technology in the field of human-machine interfaces. To apply speech technology to emotion recognition, this study investigates a range of speech features in order to establish a relationship between emotional groups and their corresponding voice characteristics. The features examined relate to both the speech source and the vocal tract filter. Experimental results show that the statistically significant speech parameters for classifying the emotional groups are mainly source-related: jitter, shimmer, F0 (F0_min, F0_max, F0_mean, F0_std), harmonic parameters (H1, H2, HNR05, HNR15, HNR25, HNR35), and SPI.
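As an illustration of the source-related parameters named in the abstract, the following is a minimal sketch of how F0 statistics, jitter, and shimmer could be computed from per-cycle measurements. The abstract does not say which definitions or tools the study used, so the "local" perturbation formula here (mean absolute difference between consecutive cycles divided by the mean) is an assumption, and the function names and sample values are hypothetical.

```python
import statistics


def f0_stats(periods_s):
    """Summarise F0 (Hz) from a list of glottal cycle durations in seconds."""
    f0 = [1.0 / p for p in periods_s]
    return {
        "F0_min": min(f0),
        "F0_max": max(f0),
        "F0_mean": statistics.fmean(f0),
        "F0_std": statistics.stdev(f0),  # sample standard deviation
    }


def local_perturbation(values):
    """Mean absolute difference of consecutive values, relative to the mean value.

    Applied to cycle durations this gives a local jitter ratio; applied to
    cycle peak amplitudes it gives a local shimmer ratio (assumed definitions).
    """
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return statistics.fmean(diffs) / statistics.fmean(values)


# Hypothetical per-cycle measurements, not data from the study.
periods = [0.0100, 0.0102, 0.0099, 0.0101, 0.0100]   # seconds
amplitudes = [0.80, 0.82, 0.79, 0.81, 0.80]          # arbitrary linear units

jitter = local_perturbation(periods)
shimmer = local_perturbation(amplitudes)
stats = f0_stats(periods)

print(f"jitter  = {jitter:.4%}")
print(f"shimmer = {shimmer:.4%}")
print(stats)
```

In practice the cycle durations and amplitudes would come from a pitch-marking step on voiced speech, which this sketch leaves out.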
