Thesis: Detección de discursos de odio en redes sociales chilenas mediante métodos de aprendizaje automático (Hate speech detection in Chilean social networks using machine learning methods)
Date
2023
Program
Ingeniería Civil Informática
Campus
Campus Santiago San Joaquín
Abstract
Developing tools to identify Hate Speech in social media is an important step toward understanding this phenomenon globally. Unfortunately, most available resources target English.

The few resources that exist in Spanish rarely account for the great diversity of dialects of the language, which makes it difficult to assess the impact of specific sociocultural characteristics when generalizing solutions from one dialect or language to another. As a result, studying Hate Speech in most Spanish-speaking countries, such as Chile, is challenging because algorithms must be transferred from other contexts.

This work introduces a new resource for studying Hate Speech, built from tweets in the Chilean dialect of Spanish. The corpus includes 4572 tweets collected from Twitter in a semi-automatic fashion and subsequently validated and annotated manually by a group of 15 human annotators, ensuring three independent annotations per tweet. The dataset contains 45.5% positive cases for the Hate Speech class, making it better balanced than existing resources for this language. The annotations also identify whether each tweet refers to one of four communities often targeted by provocative or offensive messages: women, migrants, indigenous peoples, and the LGBTQ+ community, thus supporting a more detailed analysis of the phenomenon. In addition, each tweet is released together with the thread or conversation to which it belongs, enriching currently available resources in contextual terms.

Besides the data collection and annotation methodology, this work presents a series of experiments to evaluate the quality of the resulting corpus using three multilingual models and three alternative state-of-the-art datasets. First, we study the transferability of the proposed resource, i.e., its ability to improve hate detection on other datasets. Second, we conduct experiments to detect the presence of biases in the collected data, considering seven protected groups or communities. Finally, we run 27 functional tests recently proposed by HATECHECK [55] to evaluate models in cases where they tend to fail or over-specialize.

The results show that, compared with its only counterpart in the Chilean dialect, the generated dataset is more transferable to other dialects and languages, exhibits fewer biases, and achieves better results in the majority of the functional tests, even for dialects other than the one used in training. Compared with resources in other languages, the generated dataset is highly competitive, for example outperforming the state of the art in 16 of the 27 HATECHECK functional tests and achieving the best transfer to Chilean Spanish and English. As disadvantages, the corpus showed slightly higher biases than datasets in other languages and did not improve transfer to Castilian (Standard) Spanish, issues that should be addressed in greater depth in future work. All generated data and code have been made publicly available to facilitate their use and reproduction by the community (IEEE Dataport).
Keywords
Natural language processing, Dataset, Machine learning