Thesis
DISTRIBUTED MACHINE LEARNING WITH CONTEXT AWARENESS

Date

2014

Publisher

Universidad Técnica Federico Santa María

Abstract

This thesis presents a distributed machine learning framework for modeling distributed data with different contexts in the task of regression. A different context is defined as a change in the underlying laws of probability across the distributed sources. Most state-of-the-art methods do not take differing contexts into account and assume that all data come from the same statistical distribution. We propose an aggregation scheme for models that lie in the same neighborhood in terms of similarity, by means of clustering algorithms, feedforward neural networks, stacked generalization models, and ensemble approaches. Two proposals are presented. The first relies on the theoretical statistical distribution that the different data sets could have and, with a hypothesis test based on divergence measures, creates neighborhoods of similar distributed sources. The second does not assume a statistical distribution beforehand and creates neighborhoods using well-known distance metrics and clustering algorithms over a discrete representation of the underlying law of probability. Both proposals respect the most important restrictions of distributed learning problems: "raw" data are not shared between distributed sites, and the data need not be uploaded to a central site. Experiments with 5 synthetic and 7 real data sets were conducted to validate the proposals. In most cases, the proposed algorithms outperform other models that follow a traditional approach.
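
The idea behind the second proposal can be illustrated with a minimal sketch. This is not the thesis's implementation: the abstract does not name the distance metric, the clustering algorithm, or the discrete representation, so the choices below (normalized histograms over shared bin edges, Hellinger distance, threshold-based single-linkage grouping) and all function names are assumptions made for illustration only. The sketch only shows that sites can be grouped by exchanging histograms rather than raw data.

```python
import math
import random


def histogram(values, edges):
    """Normalized histogram of `values` over shared bin `edges`.

    Values outside [edges[0], edges[-1]] are ignored.
    """
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            last = i == len(counts) - 1
            if edges[i] <= v < edges[i + 1] or (last and v == edges[-1]):
                counts[i] += 1
                break
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def hellinger(p, q):
    """Hellinger distance between two discrete distributions (range 0 to 1)."""
    s = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(s / 2)


def neighborhoods(hists, threshold):
    """Group sites whose histograms are within `threshold` (single linkage)."""
    parent = list(range(len(hists)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(hists)):
        for j in range(i + 1, len(hists)):
            if hellinger(hists[i], hists[j]) <= threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(hists)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical demo: four sites, two contexts (assumed Gaussian targets).
rng = random.Random(0)
sites = [
    [rng.gauss(0.0, 1.0) for _ in range(500)],  # site 0, context A
    [rng.gauss(0.0, 1.0) for _ in range(500)],  # site 1, context A
    [rng.gauss(5.0, 1.0) for _ in range(500)],  # site 2, context B
    [rng.gauss(5.0, 1.0) for _ in range(500)],  # site 3, context B
]
edges = [-4 + i for i in range(14)]  # shared bin edges agreed in advance

# Each site computes its histogram locally; only histograms are exchanged.
hists = [histogram(s, edges) for s in sites]
groups = neighborhoods(hists, threshold=0.3)
print(groups)
```

With these synthetic sites, the first two and the last two should fall into separate neighborhoods. In a full pipeline, the local regression models within each neighborhood would then be aggregated, for example with an ensemble or stacked generalization, which this sketch leaves out.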

Description

Catalogued from the PDF version of the thesis

Campus

Casa Central, Valparaíso