Thesis: Desempeño de la verosimilitud compuesta para datos espaciales (Performance of composite likelihood for spatial data)
Date
2024-07
Program
Ingeniería Civil Matemática
Campus
Campus Casa Central Valparaíso
Abstract
This thesis addresses the analysis of large-scale georeferenced data, focusing on spatial Gaussian process models and the methodologies required to analyze them. The research centers on inference for these models, especially for large data sets, which pose significant computational challenges because of the size and complexity of the covariance matrices involved.
Two strategies for reducing the computational burden are explored, both based on the composite likelihood introduced by Lindsay (1988). This technique captures spatial dependence while reducing the size of the matrices involved, which significantly reduces computation time. The first strategy accounts for the relationship between each block and its adjacent blocks by working with pairs of blocks, while the second assumes independence between blocks and treats them individually. The effect of how the blocks are selected is also analyzed. To this end, two approaches are studied: the row-wise approach (RW), in which the blocks are arranged in a regular grid, and the column-wise approach (CW), in which each block is formed by multiple non-adjacent columns.
We conclude that the strategy based on pairs of blocks provides better estimates but is computationally less efficient than the one that assumes independence. For both strategies, the CW approach gives better results. The study includes numerical experiments and comparative analyses of the different strategies, as well as a review of the fundamental concepts of spatial statistics and stochastic processes. Finally, conclusions and possible future work are discussed.
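The two strategies described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the thesis: a zero-mean process with an exponential covariance, parameters named `sigma2` and `phi`, an 8 x 8 grid split into four quadrant blocks, and the hypothetical function names. It does not reproduce the RW or CW block layouts.

```python
import numpy as np

def exp_cov(a, b, sigma2, phi):
    """Exponential covariance between two sets of 2-D locations (assumed model)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return sigma2 * np.exp(-d / phi)

def gauss_loglik(z, coords, sigma2, phi):
    """Exact zero-mean Gaussian log-likelihood for one set of observations."""
    chol = np.linalg.cholesky(exp_cov(coords, coords, sigma2, phi))
    alpha = np.linalg.solve(chol, z)              # alpha @ alpha = z' C^{-1} z
    logdet = 2.0 * np.sum(np.log(np.diag(chol)))  # log |C|
    return -0.5 * (alpha @ alpha + logdet + len(z) * np.log(2.0 * np.pi))

def independent_blocks_cl(z, coords, blocks, sigma2, phi):
    """Composite log-likelihood that treats the blocks as independent."""
    return sum(gauss_loglik(z[b], coords[b], sigma2, phi) for b in blocks)

def pairwise_blocks_cl(z, coords, blocks, pairs, sigma2, phi):
    """Composite log-likelihood summing the joint likelihood of each block pair."""
    total = 0.0
    for i, j in pairs:
        idx = np.concatenate([blocks[i], blocks[j]])
        total += gauss_loglik(z[idx], coords[idx], sigma2, phi)
    return total

# Illustrative data: an 8 x 8 regular grid split into four quadrant blocks.
side = 8
xs, ys = np.meshgrid(np.arange(side), np.arange(side), indexing="ij")
coords = np.column_stack([xs.ravel(), ys.ravel()]).astype(float)
half = side // 2
block_id = (coords[:, 0] >= half).astype(int) * 2 + (coords[:, 1] >= half).astype(int)
blocks = [np.flatnonzero(block_id == k) for k in range(4)]
pairs = [(0, 1), (0, 2), (1, 3), (2, 3)]  # quadrants that share an edge

# Simulate one realization from the assumed model.
rng = np.random.default_rng(0)
true_chol = np.linalg.cholesky(exp_cov(coords, coords, 1.0, 2.0))
z = true_chol @ rng.standard_normal(len(coords))

cl_indep = independent_blocks_cl(z, coords, blocks, 1.0, 2.0)
cl_pairs = pairwise_blocks_cl(z, coords, blocks, pairs, 1.0, 2.0)
full = gauss_loglik(z, coords, 1.0, 2.0)
```

In this sketch the full likelihood factorizes a 64 x 64 covariance matrix, the independent-block version factorizes four 16 x 16 matrices, and the pairwise version factorizes four 32 x 32 matrices, which reflects the trade-off between retained dependence and computational cost that the abstract describes. Either composite function could be passed to a numerical optimizer over `sigma2` and `phi` to obtain estimates.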
Keywords
Georeferenced scientific data, Spatial statistical models, Numerical experiments