VOICE QUALITY MODIFICATION USING THE WORLD VOCODER
Abstract
The role of the self-perception of voice quality has not been explored in the literature, yet
it is believed to play a critical role in the development of hyperfunctional voice disorders.
By modifying the auditory feedback, researchers can further investigate the underlying laryngeal motor control mechanisms. However, there are no synthesizers capable of changing
voice quality in a controlled manner. Thus, this thesis presents a Vocoder capable of
introducing voice quality changes with very low latency. The proposed Vocoder is based on
the well-known WORLD vocoder and can simultaneously modify fundamental frequency,
spectral envelope, and voice aperiodicity. An excitation signal is generated using the Rosenberg++ glottal pulse and a wave shape parameter, along with parameters that allow for
the fine-tuning of the Rosenberg++ pulse. Frequency and amplitude modulations can also
be applied using Brownian noise and sinusoidal signals. It is also possible to modify the
fundamental frequency of the input signal with different parameters such as a multiplier,
filtering, and added vibrato. Results show that the synthesized voice sounds natural
and that it is possible to synthesize Breathy, Rough, Dysphonic, Vocal Fry, and Modal
voices. An objective assessment of voice quality changes is performed to quantify the resynthesized output and the effect of the parameters used to control voice quality.
The current implementation runs offline, but a real-time implementation of the Vocoder
on an embedded system is feasible.
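
As an illustration of the fundamental-frequency controls mentioned above (multiplier, filtering, and added vibrato), the following is a minimal sketch and not the implementation used in the thesis. It assumes a WORLD-style F0 contour with one value per analysis frame, where unvoiced frames are marked with 0, and a 5 ms frame period. The function name `modify_f0`, all parameter names, and the choice of a moving average as the filter are hypothetical.

```python
import math


def modify_f0(f0, multiplier=1.0, smooth_frames=1,
              vibrato_rate_hz=0.0, vibrato_depth_cents=0.0,
              frame_period_s=0.005):
    """Return a modified copy of a frame-wise F0 contour (Hz).

    Hypothetical helper: unvoiced frames (F0 <= 0) are left at 0.
    Voiced frames are smoothed with a moving average over voiced
    neighbours (a stand-in for the unspecified filtering step),
    scaled by `multiplier`, and modulated by a sinusoidal vibrato
    whose depth is given in cents.
    """
    n = len(f0)
    half = smooth_frames // 2
    out = []
    for i, value in enumerate(f0):
        if value <= 0.0:
            # Keep unvoiced frames unvoiced.
            out.append(0.0)
            continue
        # Moving average restricted to voiced frames in the window.
        window = [f0[j]
                  for j in range(max(0, i - half), min(n, i + half + 1))
                  if f0[j] > 0.0]
        smoothed = sum(window) / len(window)
        # Sinusoidal vibrato evaluated at this frame's time stamp.
        t = i * frame_period_s
        cents = vibrato_depth_cents * math.sin(
            2.0 * math.pi * vibrato_rate_hz * t)
        out.append(smoothed * multiplier * 2.0 ** (cents / 1200.0))
    return out
```

For example, `modify_f0(contour, multiplier=1.2, vibrato_rate_hz=5.0, vibrato_depth_cents=50.0)` would raise the pitch by 20% and add a 5 Hz vibrato of half a semitone peak deviation. The modified contour could then be passed to the synthesis stage together with the spectral envelope and aperiodicity.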