Discriminative Weight Training for Gender Identification

(Korean title, translated: A Gender Identification Algorithm Using Discriminative Weight Training)

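  • Sang-Ik Kang (강상익; School of Electronic Engineering, Inha University)
  • Joon-Hyuk Chang (장준혁; School of Electronic Engineering, Inha University)
  • Published: 2008.07.31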

Abstract

In this paper, we apply discriminative weight training to support vector machine (SVM) based gender identification. In our approach, the gender decision rule is expressed as an SVM operating on optimally weighted mel-frequency cepstral coefficients (MFCC), with the weights obtained by a minimum classification error (MCE) method. Unlike previous work, a different weight is assigned to each MFCC filter bank, which is considered more realistic. Experimental results show that the proposed approach is effective for SVM-based gender identification.

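(Translated from the Korean abstract.) To improve the performance of gender identification systems, this paper proposes an optimized support vector machine (SVM) based on discriminative weight training. Using the minimum classification error (MCE) method, we propose an SVM in which each order (dimension) of the mel-frequency cepstral coefficient (MFCC) feature vector receives its own weight. The proposed algorithm was compared with a conventional SVM-based gender identification system that uses equal weights, and it showed better performance.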
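
The idea of learning one weight per feature dimension with an MCE-style criterion can be illustrated with a minimal sketch. This is not the authors' implementation: the fixed linear scorer (`coef`, `bias`) merely stands in for a trained SVM decision function, the two-dimensional toy data stands in for MFCC vectors, and all function names, the sigmoid loss form, and the hyperparameters (`gamma`, `lr`, `n_epochs`) are assumptions chosen for illustration.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def score(x, coef, bias, weights):
    """Linear decision score on per-dimension weighted features.

    coef and bias are a hypothetical stand-in for a trained linear SVM.
    """
    return sum(w * c * v for w, c, v in zip(weights, coef, x)) + bias


def mce_loss(data, coef, bias, weights, gamma=1.0):
    """Mean smoothed error: sigmoid of the misclassification measure d = -y * score."""
    total = 0.0
    for x, y in data:
        d = -y * score(x, coef, bias, weights)
        total += sigmoid(gamma * d)
    return total / len(data)


def train_weights(data, coef, bias, n_epochs=50, lr=0.1, gamma=1.0):
    """Stochastic gradient descent on the smoothed error with respect to the weights."""
    weights = [1.0] * len(coef)  # start from the equal-weight baseline
    for _ in range(n_epochs):
        for x, y in data:
            d = -y * score(x, coef, bias, weights)
            s = sigmoid(gamma * d)
            slope = gamma * s * (1.0 - s)  # derivative of the sigmoid loss w.r.t. d
            for k in range(len(weights)):
                grad_k = slope * (-y * coef[k] * x[k])  # chain rule: dd/dw_k
                weights[k] -= lr * grad_k
    return weights


def make_toy_data(n, seed=0):
    """Toy two-class data: dimension 0 is informative, dimension 1 is pure noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        y = rng.choice([-1, 1])
        x = [y * 1.0 + rng.gauss(0.0, 0.5), rng.gauss(0.0, 2.0)]
        data.append((x, y))
    return data


data = make_toy_data(200)
coef, bias = [1.0, 1.0], 0.0
trained = train_weights(data, coef, bias)
print("equal-weight loss:", mce_loss(data, coef, bias, [1.0, 1.0]))
print("trained-weight loss:", mce_loss(data, coef, bias, trained))
print("trained weights:", trained)
```

On this toy data the learned weight for the informative dimension should grow while the weight for the noisy dimension shrinks, which mirrors the motivation for not weighting all feature dimensions equally. A real system would replace the toy data with MFCC features extracted from speech and the fixed scorer with an actual SVM.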
