A New Autocorrelation-Based Method for Extracting the Fundamental Frequency in Voice Signals
DOI: https://doi.org/10.5540/tema.2007.08.02.0191
Abstract
This article describes the algorithm for extracting the fundamental frequency of the voice signal used in the implementation of the program P-NAV (Programa Neuro Analizador Vocal) by Brandão (2006). The proposed method builds on the algorithm described by Boersma (1993), which uses the autocorrelation method, and develops four algorithms, yielding a more robust method for correctly marking the periods of the voice signal, even in severely perturbed and diplophonic segments.
References
P. Boersma, Accurate short-term analysis of the fundamental frequency and the harmonics-to-noise ratio of a sampled sound, IFA Proceedings, 17 (1993), 97-110.
A. Brandão, F.R. Leta, Usando redes neurais para classificação de padrões de voz, in “XXVII CNMAC - Congresso Nacional de Matemática Aplicada e Computacional”, SBMAC, 2005.
A. Brandão, “Classificação de Vozes Naturais e de Vozes Sintetizadas através de Modelos Mecânicos de Laringe e de Trato Vocal usando Redes Neurais”, Master's dissertation, Universidade Federal Fluminense, Niterói, RJ, 2006.
A. Brandão, E. Cataldo, R. Sampaio, “Análise e Processamento de Sinais”, Course notes, SBMAC, 2005.
J. Cernocky, “Speech Processing Using Automatically Derived Segmental Units”, PhD Thesis, ESIEE, France, 1998.
M.P. Karnell, Laryngeal perturbation analysis: minimum length of analysis window, Journal of Speech and Hearing Research, 34 (1991), 544-548.
A.P. Klapuri, Multiple fundamental frequency estimation based on harmonicity and spectral smoothness, IEEE Transactions on Speech and Audio Processing, 11, No. 6 (2003).
P. Lieberman, Perturbation in vocal pitch, Journal of the Acoustical Society of America, 33 (1961), 597-603.
P. Motlíček, L. Burget, “Reliability Improvement of Speech Pitch Detection Using Paths”, Institute of Radio Electronics, Faculty of Electrical Engineering, TU Brno, 2000.
L.R. Rabiner, et al., A comparative performance study of several pitch detection algorithms, IEEE Transactions on Acoustics, Speech, and Signal Processing, ASSP-24, No. 5 (1976).
D. Talkin, A robust algorithm for pitch tracking (RAPT), in “Speech Coding and Synthesis”, Elsevier, New York, 1995.
D. Wong, R. Lange, I. Titze, C.G. Guo, Mechanisms of Jitter-Induced Shimmer in a driven model of vocal fold vibration, in “NCVS Status and Progress Report”, pp. 33-41, 1995.
License
Authors who publish in this journal agree to the following terms:
Authors retain copyright and grant the journal the right of first publication, with the work simultaneously licensed under the Creative Commons Attribution License that allows the sharing of the work with acknowledgment of authorship and initial publication in this journal.
Authors are authorized to enter into separate, additional contracts for the non-exclusive distribution of the version of the work published in this journal (e.g., publishing it in an institutional repository or as a book chapter), with acknowledgment of authorship and initial publication in this journal.
Authors are allowed and encouraged to publish and distribute their work online (e.g., in institutional repositories or on their personal page) at any point before or during the editorial process, as this can generate productive changes as well as increase the impact and citation of the published work (see The Effect of Open Access).
This is an open access journal, which means that all content is freely available without charge to the user or his/her institution. Users are allowed to read, download, copy, distribute, print, search, or link to the full texts of the articles, or use them for any other lawful purpose, without asking prior permission from the publisher or the author. This is in accordance with the BOAI definition of open access.
Intellectual Property
All content of this journal, except where otherwise noted, is licensed under a Creative Commons Attribution (CC BY) License.