Publication date:
2013-03-30
Description:
Owing to the large number of public information sources and services, many methods for combining heterogeneous data have been proposed recently. However, general end-to-end solutions are still rare, especially systems that take different context dimensions into account. As a result, existing techniques often prove insufficient or are limited to a certain domain. In this paper we briefly review and rigorously evaluate a general framework for data matching and merging. The framework employs collective entity resolution and redundancy elimination using three dimensions of context types. To achieve domain-independent results, data is enriched with semantics and trust. The main contribution of the paper is an evaluation on five public, domain-incompatible datasets. Furthermore, we introduce additional attribute, relationship, semantic and trust metrics, which allow complete framework management. Besides improving overall results within the framework, the metrics could be of independent interest.
Content type:
Journal article
Pages:
119-152
Authors:
Slavko Žitnik, University of Ljubljana, Faculty of Computer and Information Science, Tržaska cesta 25, SI-1000 Ljubljana, e-mail: slavko.zitnik@fri.uni-lj.si
Lovro Šubelj, University of Ljubljana, Faculty of Computer and Information Science, Tržaska cesta 25, SI-1000 Ljubljana, e-mail: lovro.subelj@fri.uni-lj.si
Dejan Lavbič, University of Ljubljana, Faculty of Computer and Information Science, Tržaska cesta 25, SI-1000 Ljubljana, e-mail: dejan.lavbic@fri.uni-lj.si
Olegas Vasilecas, Information Systems Research Laboratory, Vilnius Gediminas Technical University, Saulėtekio 11, LT-10223 Vilnius, Lithuania, e-mail: olegas@fm.vgtu.lt
Marko Bajec, University of Ljubljana, Faculty of Computer and Information Science, Tržaska cesta 25, SI-1000 Ljubljana, e-mail: marko.bajec@fri.uni-lj.si
Journal:
Informatica
Journal issue:
Volume 24, Number 1 / 2013
Print ISSN:
0868-4952
Online ISSN:
1822-8844
Subject:
Computer science