OTM 2009 - LNCS 5870-5872 CD-ROM

Learning Link-Based Naïve Bayes Classifiers from Ontology-Extended Distributed Data

Cornelia Caragea¹, Doina Caragea², and Vasant Honavar¹

¹Computer Science Department, Iowa State University
cornelia@cs.iastate.edu
honavar@cs.iastate.edu

²Computer and Information Sciences, Kansas State University
dcaragea@ksu.edu

Abstract. We address the problem of learning predictive models, from a user's point of view, from multiple large, distributed, autonomous, and hence almost invariably semantically disparate, relational data sources. We show, under fairly general assumptions, how to exploit data sources annotated with relevant metadata to build predictive models (e.g., classifiers) from a collection of distributed relational data sources without the need for a centralized data warehouse, while offering strong guarantees of exactness of the learned classifiers relative to their centralized relational learning counterparts. We apply the proposed approach to learning link-based Naïve Bayes classifiers and present results of experiments on a text classification task that demonstrate its feasibility.
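The exactness claim can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes plain multinomial Naïve Bayes over horizontally partitioned text data, a vocabulary already shared by all sources (that is, any semantic mapping has been applied), and no link-based features. The names `local_counts`, `merge`, `log_posteriors`, `site_a`, and `site_b` are hypothetical. The point is only that count-based sufficient statistics add up across sources, so a classifier built from merged counts matches one built from pooled data.

```python
from collections import Counter
import math


def local_counts(docs):
    """Compute sufficient statistics at one data source.

    docs is a list of (words, label) pairs. Returns class counts and
    (label, word) co-occurrence counts. Hypothetical helper.
    """
    class_counts = Counter()
    word_counts = Counter()
    for words, label in docs:
        class_counts[label] += 1
        for w in words:
            word_counts[(label, w)] += 1
    return class_counts, word_counts


def merge(stats_list):
    """Sum the per-source counts; only counts leave each source, not raw data."""
    total_class, total_word = Counter(), Counter()
    for class_counts, word_counts in stats_list:
        total_class.update(class_counts)  # Counter.update adds counts
        total_word.update(word_counts)
    return total_class, total_word


def log_posteriors(words, class_counts, word_counts, vocab):
    """Unnormalized log posterior per class, with Laplace smoothing."""
    n = sum(class_counts.values())
    scores = {}
    for label, k in class_counts.items():
        total = sum(word_counts[(label, v)] for v in vocab)
        score = math.log(k / n)
        for w in words:
            score += math.log((word_counts[(label, w)] + 1) / (total + len(vocab)))
        scores[label] = score
    return scores


# Toy data held at two separate sources (illustrative only).
site_a = [(["neural", "network"], "ml"), (["protein", "fold"], "bio")]
site_b = [(["kernel", "network"], "ml"), (["gene", "protein"], "bio")]
vocab = {"neural", "network", "protein", "fold", "kernel", "gene"}

distributed = merge([local_counts(site_a), local_counts(site_b)])
centralized = local_counts(site_a + site_b)

scores = log_posteriors(["protein"], *distributed, vocab)
predicted = max(scores, key=scores.get)
```

In this sketch `distributed` and `centralized` hold identical counts, so every probability estimate derived from them is identical as well.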

LNCS 5871, p. 1139 ff.
