Integration of graphs from different data sources using crowdsourcing
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Kim, Younghoon | - |
dc.contributor.author | Jung, Woohwan | - |
dc.contributor.author | Shim, Kyuseok | - |
dc.date.accessioned | 2021-06-22T14:22:31Z | - |
dc.date.available | 2021-06-22T14:22:31Z | - |
dc.date.issued | 2017-04 | - |
dc.identifier.issn | 0020-0255 | - |
dc.identifier.issn | 1872-6291 | - |
dc.identifier.uri | https://scholarworks.bwise.kr/erica/handle/2021.sw.erica/10042 | - |
dc.description.abstract | Data integration is the process of identifying pairs of records from different databases that refer to the same real-world entity. It has been studied extensively under the names entity resolution, record linkage, duplicate detection, and network alignment. With the increasing use of crowdsourcing platforms as a means of answering queries manually at low cost, many studies have begun to consider ways to exploit crowdsourcing systems for efficient data integration. In this paper, we present an efficient algorithm to integrate two graphs collected from different sources using crowdsourcing systems. Given two graphs, we repeatedly select a query node from one graph and ask a human annotator to find its matching node in the other graph, that is, the node representing the same entity as the query node. The proposed method chooses the query node that would increase the precision the most if it were labeled. Through experiments with both simulated answers and labels collected by real crowdsourcing, we show that our algorithm finds more accurate graph matches at a lower crowdsourcing cost than the baseline algorithms. © 2017 Elsevier Inc. All rights reserved. | - |
dc.format.extent | 19 | - |
dc.language | English | - |
dc.language.iso | ENG | - |
dc.publisher | Elsevier BV | - |
dc.title | Integration of graphs from different data sources using crowdsourcing | - |
dc.type | Article | - |
dc.publisher.location | United States | - |
dc.identifier.doi | 10.1016/j.ins.2017.01.006 | - |
dc.identifier.scopusid | 2-s2.0-85008698185 | - |
dc.identifier.wosid | 000393245000024 | - |
dc.identifier.bibliographicCitation | Information Sciences, v.385, pp 438 - 456 | - |
dc.citation.title | Information Sciences | - |
dc.citation.volume | 385 | - |
dc.citation.startPage | 438 | - |
dc.citation.endPage | 456 | - |
dc.type.docType | Article | - |
dc.description.isOpenAccess | N | - |
dc.description.journalRegisteredClass | sci | - |
dc.description.journalRegisteredClass | scie | - |
dc.description.journalRegisteredClass | scopus | - |
dc.relation.journalResearchArea | Computer Science | - |
dc.relation.journalWebOfScienceCategory | Computer Science, Information Systems | - |
dc.subject.keywordPlus | ALGORITHMS | - |
dc.subject.keywordAuthor | Graph integration | - |
dc.subject.keywordAuthor | Crowdsourcing | - |
dc.subject.keywordAuthor | Entity resolution | - |
dc.identifier.url | https://www.sciencedirect.com/science/article/pii/S002002551730018X?via%3Dihub | - |
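
The query loop described in the abstract (select a query node, ask an annotator for its match, repeat) can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the abstract does not give the selection criterion in detail, so the sketch substitutes a simple uncertainty heuristic (query the node whose two best candidate scores are closest). The names `integrate`, `margin`, `ask_crowd`, and `budget`, as well as the similarity scores, are hypothetical and exist only for illustration.

```python
from typing import Callable, Dict, Hashable

Node = Hashable
# scores[u][v]: assumed precomputed similarity between node u of the first
# graph and candidate node v of the second graph (hypothetical input).
Scores = Dict[Node, Dict[Node, float]]


def margin(candidates: Dict[Node, float]) -> float:
    """Gap between the best and second-best score (candidates must be non-empty)."""
    top = sorted(candidates.values(), reverse=True)
    return top[0] - (top[1] if len(top) > 1 else 0.0)


def integrate(scores: Scores,
              ask_crowd: Callable[[Node], Node],
              budget: int) -> Dict[Node, Node]:
    """Spend up to `budget` crowd queries, then return a node matching."""
    labeled: Dict[Node, Node] = {}
    for _ in range(budget):
        unlabeled = [u for u in scores if u not in labeled]
        if not unlabeled:
            break
        # Stand-in heuristic (an assumption, not the paper's criterion):
        # query the node whose automatic match is the most ambiguous.
        query = min(unlabeled, key=lambda u: margin(scores[u]))
        labeled[query] = ask_crowd(query)
    # Unqueried nodes fall back to their highest-scoring candidate.
    matching = {u: max(c, key=c.get) for u, c in scores.items()}
    matching.update(labeled)  # crowd answers override automatic guesses
    return matching


# Toy data: a simulated annotator that always answers correctly.
scores = {
    "a1": {"b1": 0.9, "b2": 0.1},
    "a2": {"b1": 0.5, "b2": 0.45},
    "a3": {"b2": 0.6, "b3": 0.4},
}
truth = {"a1": "b1", "a2": "b2", "a3": "b3"}
one_query = integrate(scores, truth.__getitem__, budget=1)
two_queries = integrate(scores, truth.__getitem__, budget=2)
```

With this toy data, the first query goes to `a2` (the most ambiguous node) and the second to `a3`, so two queries are enough to recover the full matching, while `a1` is matched automatically without any crowd cost.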