Speech Enhancement Based on Minima Controlled Recursive Averaging Technique Incorporating Conditional MAP

  • Published: 2008.07.31

Abstract

In this paper, we propose a speech enhancement method that improves minima controlled recursive averaging (MCRA) by incorporating the conditional maximum a posteriori (MAP) criterion. A crucial component of a practical speech enhancement system is the estimation of the noise power spectrum, and MCRA is one state-of-the-art approach to it. The MCRA noise estimate is obtained by averaging past spectral power values with a smoothing parameter that is adjusted by the signal presence probability in each frequency subband. The proposed algorithm modifies this speech presence probability so that it is the a posteriori probability conditioned on both the current observation and the speech presence or absence in the previous frame. Evaluated with ITU-T P.862 perceptual evaluation of speech quality (PESQ) and subjective speech quality tests, the proposed algorithm yields better results than the conventional MCRA-based scheme.
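The noise update described in the abstract can be sketched for a single frequency subband as follows. This is a minimal illustration and not the authors' implementation. The names `SubbandState`, `init_state`, and `mcra_step`, and all parameter values, are assumptions made for the example. The abstract does not state how the conditional MAP criterion changes the presence probability, so the sketch assumes one plausible form: the decision threshold depends on whether speech was detected in the previous frame. With both thresholds equal, the sketch reduces to a plain MCRA-style update.

```python
from dataclasses import dataclass


@dataclass
class SubbandState:
    smoothed: float            # recursively smoothed noisy power
    s_min: float               # running minimum of the smoothed power
    s_tmp: float               # candidate minimum for the next search window
    noise: float               # current noise power estimate
    presence: float = 0.0      # smoothed speech presence probability
    prev_speech: bool = False  # hard decision taken in the previous frame
    frames: int = 0


def init_state(first_power):
    """Start every tracked quantity from the first observed power value."""
    return SubbandState(smoothed=first_power, s_min=first_power,
                        s_tmp=first_power, noise=first_power)


def mcra_step(st, power, alpha_s=0.8, alpha_p=0.2, alpha_d=0.95, window=100,
              delta_prev_absent=5.0, delta_prev_present=5.0):
    """Update the noise estimate with one frame of noisy power |Y|^2.

    All defaults are illustrative. The two thresholds are a hypothetical
    stand-in for conditioning on the previous frame's speech decision.
    """
    st.frames += 1
    # Smooth the noisy power over time.
    st.smoothed = alpha_s * st.smoothed + (1 - alpha_s) * power
    # Track the minimum of the smoothed power over a sliding search window.
    if st.frames % window == 0:
        st.s_min = min(st.s_tmp, st.smoothed)
        st.s_tmp = st.smoothed
    else:
        st.s_min = min(st.s_min, st.smoothed)
        st.s_tmp = min(st.s_tmp, st.smoothed)
    # Assumed conditional rule: pick the threshold from the previous decision.
    delta = delta_prev_present if st.prev_speech else delta_prev_absent
    speech_now = st.smoothed / max(st.s_min, 1e-12) > delta
    # Recursively smooth the hard decision into a presence probability.
    indicator = 1.0 if speech_now else 0.0
    st.presence = alpha_p * st.presence + (1 - alpha_p) * indicator
    # The smoothing parameter approaches 1 when speech is likely present,
    # which slows the noise update during speech.
    alpha_eff = alpha_d + (1 - alpha_d) * st.presence
    st.noise = alpha_eff * st.noise + (1 - alpha_eff) * power
    st.prev_speech = speech_now
    return st.noise
```

Feeding the sketch a stationary noise floor followed by a short high-power burst shows the intended behavior: the presence probability rises toward 1 during the burst, so the noise estimate stays close to the floor instead of following the burst. Setting `delta_prev_present` lower than `delta_prev_absent` would make the detector more willing to keep a speech decision once one has been made, which is one way to express the previous-frame conditioning.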
