OTM 2013 - LNCS 8185-8186

On the Likelihood of an Equivalence

Giovanni Bartolomeo¹, Stefano Salsano¹, and Hugh Glaser²

¹University of Rome Tor Vergata, Via del Politecnico 1, 00133 Rome, Italy
Giovanni.Bartolomeo@uniroma2.it
Stefano.Salsano@uniroma2.it

²Seme4, Ltd., 18 Soho Square, London, W1D 3QL, UK
Hugh.Glaser@seme4.com

Abstract. Co-references are traditionally used when integrating data from different datasets. This approach has various benefits, such as fault tolerance, ease of integration, and traceability of provenance; however, it often leads to the problem of entity consolidation, i.e., of objectively stating whether all the co-references really refer to the same entity and, when this is the case, whether they all convey the same intended meaning. Relying on the mere presence of a single equivalence (owl:sameAs) statement is often problematic and may sometimes cause serious trouble. It has been observed that a numerically weighted measure could be used to indicate the likelihood of an equivalence, but this raises the really hard question of where precisely these values will come from. To answer this question, we propose a methodology based on a graph clustering algorithm.

Keywords: Equivalence Mining, Co-references, Linked Data

LNCS 8186, p. 2 ff.
