Thesis: Estimación del flujo glótico a partir de señales de acelerómetro mediante aprendizaje profundo (Glottal flow estimation from accelerometer signals using deep learning)
Date
2025
Program
Ingeniería Civil Electrónica
Campus
Campus Casa Central Valparaíso
Abstract
Measuring glottal flow is fundamental to the study of the voice, but acquiring it directly is invasive and complex. A non-invasive alternative is to place an accelerometer on the neck, although estimating the flow from that signal presents its own challenges. The objective of this work is to develop a deep learning method that estimates the glottal flow waveform from accelerometer signals captured on the neck surface. To this end, an existing database of voice and accelerometry signals was used to generate a training dataset. The reference glottal flow signals were obtained by applying the quasi-closed phase (QCP) inverse filtering method to the voice signals recorded with a microphone. An automated system was implemented to process and validate the signals. A Temporal Convolutional Network (TCN) was then designed, trained, and optimized to map accelerometer signal sequences to the corresponding glottal flow sequences. The evaluation results show that the model can estimate the general morphology of the glottal flow. Its performance depends on the speaker's gender and vocal condition, with the largest errors observed for female voices with pathology. The work demonstrates the feasibility of using the model to perform inverse filtering without subject-specific calibration.
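As a rough illustration of the sequence-to-sequence mapping mentioned in the abstract, the sketch below shows the basic building block of a TCN: a stack of dilated convolutions with residual connections that returns one output sample per input sample. It is not the thesis's architecture. The causal formulation, the single convolution per layer, the kernel size, the dilations, the fixed weights, and the names `causal_dilated_conv`, `tcn_forward`, and `receptive_field` are all assumptions made for the example, and the input is a synthetic sine wave rather than real accelerometer data.

```python
import math


def causal_dilated_conv(x, w, dilation):
    """Causal 1-D convolution: y[t] = sum_k w[k] * x[t - k * dilation].

    Samples before the start of the sequence are treated as zero,
    so y[t] never depends on inputs later than t.
    """
    y = []
    for t in range(len(x)):
        acc = 0.0
        for k, wk in enumerate(w):
            idx = t - k * dilation
            if idx >= 0:
                acc += wk * x[idx]
        y.append(acc)
    return y


def tcn_forward(x, layers):
    """Apply a stack of (weights, dilation) layers with ReLU and a residual sum.

    Simplification (assumption): one convolution per layer; practical TCN
    blocks usually contain more than one, plus normalization and dropout.
    """
    h = list(x)
    for w, d in layers:
        conv = causal_dilated_conv(h, w, d)
        h = [max(c, 0.0) + prev for c, prev in zip(conv, h)]
    return h


def receptive_field(kernel_size, dilations):
    """Number of input samples that can influence one output sample."""
    return 1 + (kernel_size - 1) * sum(dilations)


# Hypothetical input: 200 samples of a sine wave standing in for the
# accelerometer signal. The weights are illustrative, not trained.
accel = [math.sin(2 * math.pi * t / 40) for t in range(200)]
layers = [([0.5, -0.2, 0.1], d) for d in (1, 2, 4, 8)]

flow_estimate = tcn_forward(accel, layers)
print(len(flow_estimate))                    # same length as the input
print(receptive_field(3, [1, 2, 4, 8]))      # 31
```

Doubling the dilation at each layer makes the receptive field grow quickly with depth, which is why this kind of network can cover a full glottal cycle with only a few layers. In a trained model the weights would be learned by minimizing the error against the QCP-derived reference flow.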
Keywords
Glottal flow, Deep learning, Biomedical signal processing, Automated validation
