The fundamental frequency (f0) distribution of Korean speakers in a dialogue corpus using Praat and R

Distribution of fundamental frequency (f0) values in a Korean dialogue speech corpus analyzed with Praat and R

  • Byunggon Yang (Department of English Education, Pusan National University)
  • 양병곤 (부산대학교 영어교육과)
  • Received : 2023.07.31
  • Accepted : 2023.09.09
  • Published : 2023.09.30

Abstract

This study examines the fundamental frequency (f0) distribution of 2,740 Korean speakers in a dialogue speech corpus. Praat and R were used to collect and analyze the acoustic f0 data after extreme values were removed on the basis of the interquartile f0 range of the intonational phrases produced by each speaker. The mean f0 of all speakers was 185 Hz and the median was 187 Hz. The f0 data showed a slight positive skew (skewness 0.11) and a kurtosis of -0.09, which is close to a normal distribution. Pitch values in daily conversation varied over a range of 238 Hz. Separate examination of the male and female groups showed distinct median f0 values: 114 Hz for males and 199 Hz for females. A t-test showed that the difference between the two groups was significant. The skewness of the distribution was 1.24 for the male group and 0.58 for the female group. The kurtosis was 5.21 and 3.88 for the male and female groups, respectively, so the male distribution appeared leptokurtic. A regression analysis of median f0 on age yielded a slope of 0.15 for the male group and -0.586 for the female group, indicating opposite trends. In conclusion, a normative f0 distribution for different Korean age and sex groups can be examined in a conversational speech corpus recorded from a large number of participants. However, more rigorous data may be required to define the relation between age and f0 values.
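The outlier removal and shape statistics described above can be sketched as follows. The study itself used Praat and R; this is a Python stand-in, and several details are assumptions not stated in the abstract: the 1.5 x IQR fence multiplier (Tukey's convention), the "inclusive" quartile method, the moment-based skewness formula, and reporting kurtosis as excess kurtosis. The f0 values are invented for illustration.

```python
import statistics


def trim_by_iqr(values, k=1.5):
    """Keep values inside [Q1 - k*IQR, Q3 + k*IQR].

    k=1.5 is Tukey's conventional multiplier, assumed here because the
    abstract does not state which fence the study used.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


def _central_moment(values, order):
    """Population central moment of the given order."""
    mean = statistics.fmean(values)
    return sum((v - mean) ** order for v in values) / len(values)


def skewness(values):
    """Moment-based skewness: m3 / m2^(3/2)."""
    return _central_moment(values, 3) / _central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Moment-based excess kurtosis: m4 / m2^2 - 3 (0 for a normal distribution)."""
    return _central_moment(values, 4) / _central_moment(values, 2) ** 2 - 3.0


# Invented f0 samples (Hz) for one intonational phrase of one speaker;
# 240 Hz mimics an octave-jump tracking error.
phrase_f0 = [110, 112, 115, 118, 120, 122, 125, 240]
trimmed = trim_by_iqr(phrase_f0)

print("kept:", trimmed)
print("median:", statistics.median(trimmed))
print("skewness:", round(skewness(trimmed), 3))
print("excess kurtosis:", round(excess_kurtosis(trimmed), 3))
```

Trimming within each intonational phrase, rather than across a speaker's whole recording, means the fences follow the local pitch level, so a legitimately high phrase is not discarded merely because the speaker's other phrases are lower.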

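This paper measured the fundamental frequency (f0), which reflects vocal fold vibration, of speakers in the Korean dialogue speech corpus distributed by the National Institute of Korean Language. It examined basic f0 statistics for Koreans in everyday conversation and investigated how the f0 distribution relates to age. Praat and R were used to collect and analyze the data, and the final f0 data were obtained by computing a boxplot for each intonational phrase of each speaker and removing extreme values on the basis of the quartiles. The mean f0 of all Korean speakers was 185 Hz and the median was 187 Hz. The skewness, which describes the shape of the distribution, was 0.11, indicating a positive skew, and the kurtosis was -0.09, very close to a normal distribution. The range of pitch variation in everyday conversation was 238 Hz. As for the difference between the sexes, the female median of 199 Hz was nearly twice the male median of 114 Hz, and a t-test showed that the difference was significant. The skewness was 1.24 for males and 0.58 for females, about half the male value. The kurtosis was 5.21 and 3.88 for the male and female groups, respectively, so the male distribution was about 34% more peaked. Across age groups, with males and females combined, f0 tended to decline gradually with age. A regression analysis of the age-group median f0 on age yielded a slope of 0.15 for the male group and -0.586 for the female group, showing opposite trends. In conclusion, the varied f0 distributions of Korean speakers by group and age can be identified in conversational speech recorded by a large number of participants, but the relation between age and f0 requires more precise data collection.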
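The regression of median f0 on age reported above amounts to fitting a straight line and reading off its slope in Hz per year. A minimal ordinary-least-squares sketch in Python follows (the study used R). The age midpoints and median f0 values are invented for illustration and are not the paper's data, and the function name `fit_line` is hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented example: age-group midpoints (years) and group median f0 (Hz).
ages = [20, 30, 40, 50, 60]
median_f0 = [210, 205, 200, 195, 190]

slope, intercept = fit_line(ages, median_f0)
print(f"slope = {slope:.3f} Hz/year, intercept = {intercept:.1f} Hz")
```

A negative slope means the median f0 falls as age increases and a positive slope means it rises, which is how slopes of opposite sign in the male and female groups indicate opposite trends.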

Acknowledgement

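This study obtained its f0 values from the "Dialogue Speech Corpus 2020" downloaded from the National Institute of Korean Language. The author thanks the Institute for building this large-scale speech resource.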
