Moraga Rocco, Claudio; Monge Anwandter, Raúl Patricio; Allende Cid, Héctor
2024-10-02
2024-10-02
2014
https://repositorio.usm.cl/handle/123456789/19248
Cataloged from the PDF version of the thesis

In this thesis, a Distributed Machine Learning framework is presented for modeling distributed data with different contexts in the task of regression. Different context is defined as a change in the underlying laws of probability across the distributed sources. Most state-of-the-art methods do not take the different contexts into account and assume that the data comes from the same statistical distribution. We propose an aggregation scheme for models that are in the same neighborhood in terms of similarity, by means of clustering algorithms, feedforward neural networks, stacked generalization models and ensemble approaches.

Two proposals are presented. The first one relies on the theoretical statistical distribution that the different data sets could have, and uses a hypothesis test based on divergence measures to create neighborhoods of similar distributed sources. The second one does not rely on a statistical distribution beforehand, and creates neighborhoods using well-known distance metrics and clustering algorithms over a discrete representation of the underlying law of probability.

Both proposals respect the most important restrictions of Distributed Learning problems: "raw" data is not shared between distributed sites, and the data does not have to be uploaded to a central site.

Experiments with 5 synthetic and 7 real data sets were conducted in order to validate the proposals. In most cases, the proposed algorithms outperform other models that follow a traditional approach.

CD ROM
Paper
DISTRIBUTED MACHINE LEARNING WITH CONTEXT AWARENESS
Postgraduate thesis
B - Available only for on-site consultation (default option)
3560900233244
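
The second proposal in the abstract (neighborhoods built from a discrete representation of each site's probability law, using a distance metric and clustering) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's actual algorithm: the histogram as the discrete representation, the Hellinger distance, the threshold-based single-linkage grouping, and all function names (`local_histogram`, `hellinger`, `neighborhoods`) are hypothetical choices made for the example. Only the normalized histograms leave each site, so raw data is never shared.

```python
import math
import random
from typing import List, Sequence


def local_histogram(values: Sequence[float], lo: float, hi: float, bins: int) -> List[float]:
    """Run at each site: summarize local data as a normalized histogram.

    Assumption: all sites agree beforehand on the range [lo, hi] and bin count.
    """
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        # Clamp out-of-range values into the edge bins.
        idx = min(bins - 1, max(0, int((v - lo) / width)))
        counts[idx] += 1
    total = len(values)
    return [c / total for c in counts]


def hellinger(p: Sequence[float], q: Sequence[float]) -> float:
    """Hellinger distance between two discrete distributions (0 = identical, 1 = disjoint)."""
    s = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(s / 2.0)


def neighborhoods(histograms: Sequence[Sequence[float]], threshold: float) -> List[List[int]]:
    """Group sites whose histograms are within `threshold` of each other.

    Assumption: simple single-linkage grouping via union-find stands in for
    whichever clustering algorithm is actually used.
    """
    n = len(histograms)
    parent = list(range(n))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if hellinger(histograms[i], histograms[j]) <= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Synthetic demo: sites 0 and 1 share one context, sites 2 and 3 another.
rng = random.Random(0)
sites = [
    [rng.gauss(0.0, 1.0) for _ in range(2000)],
    [rng.gauss(0.0, 1.0) for _ in range(2000)],
    [rng.gauss(5.0, 1.0) for _ in range(2000)],
    [rng.gauss(5.0, 1.0) for _ in range(2000)],
]
hists = [local_histogram(s, lo=-5.0, hi=10.0, bins=30) for s in sites]
groups = neighborhoods(hists, threshold=0.3)
print(groups)
```

In a setup like this, a separate regression model would then be trained or aggregated per neighborhood rather than across all sites; the threshold of 0.3 is an arbitrary value chosen for the synthetic data.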