Detailed Information

Author name disambiguation using a graph model with node splitting and merging based on bibliographic information

Authors
Shin, Dongwook; Kim, Taehwan; Choi, Joongmin; Kim, Jungsun
Issue Date
Jul-2014
Publisher
Akademiai Kiado
Keywords
Author name disambiguation; Graph model; Namesake resolution; Heteronymous name resolution; Digital library
Citation
Scientometrics, v.100, no.1, pp.15 - 50
Indexed
SCIE
SSCI
SCOPUS
Journal Title
Scientometrics
Volume
100
Number
1
Start Page
15
End Page
50
URI
https://scholarworks.bwise.kr/erica/handle/2021.sw.erica/22417
DOI
10.1007/s11192-014-1289-4
ISSN
0138-9130
Abstract
Author ambiguity arises mainly in two ways: when several different authors write their names identically, known as the namesake problem, and when one author's name is written in several different ways, referred to as the heteronymous name problem. These problems have long been an obstacle to efficient information retrieval in digital libraries, causing incorrect identification of authors and impeding correct classification of their publications. Distinguishing such authors is a nontrivial task, especially when very little information about them is available. In this paper, we propose a graph-based approach to author name disambiguation, in which a graph model is constructed from co-author relations and author ambiguity is resolved by graph operations such as vertex (or node) splitting and merging based on co-authorship. In our framework, called a Graph Framework for Author Disambiguation (GFAD), the namesake problem is solved by splitting an author vertex involved in multiple cycles of co-authorship, and the heteronymous name problem is handled by merging multiple author vertices having similar names if those vertices are connected to a common vertex. Experiments were carried out with the real DBLP and Arnetminer collections, and the performance of GFAD was compared with three representative unsupervised author name disambiguation systems. We confirm that GFAD shows better overall performance from the perspective of representative evaluation metrics. An additional contribution is that we released the refined DBLP collection to the public to facilitate organizing a performance benchmark for future systems on author disambiguation.
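The two graph operations described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' GFAD implementation: the function names, the name-similarity rule (same last token and same first initial), and the simplification of "multiple cycles of co-authorship" to connected groups among a vertex's direct co-authors are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def build_graph(papers):
    """Co-author graph: name -> set of names that share a paper with it."""
    graph = defaultdict(set)
    for authors in papers:
        for a in authors:
            graph[a]  # touch the key so single-author papers still create a vertex
        for a, b in combinations(authors, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def split_groups(graph, name):
    """Namesake side (simplified assumption): partition the co-authors of
    `name` into groups that are linked to each other without passing
    through `name`. Each group hints at one distinct real person."""
    neighbours = graph[name]
    groups, seen = [], set()
    for start in sorted(neighbours):
        if start in seen:
            continue
        group, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            # Only follow edges that stay inside the co-author neighbourhood.
            stack.extend((graph[node] & neighbours) - group)
        seen |= group
        groups.append(group)
    return groups


def similar_names(a, b):
    """Hypothetical similarity test: same last token and same first initial."""
    pa, pb = a.replace(".", "").split(), b.replace(".", "").split()
    return (
        a != b
        and pa[-1].lower() == pb[-1].lower()
        and pa[0][0].lower() == pb[0][0].lower()
    )


def merge_candidates(graph):
    """Heteronym side: pairs of similar names sharing at least one co-author."""
    return [
        (a, b)
        for a, b in combinations(sorted(graph), 2)
        if similar_names(a, b) and graph[a] & graph[b]
    ]
```

Under these assumptions, a name whose co-authors fall into two unconnected groups would be split into two vertices, while "J. Kim" and "Jungsun Kim" would be proposed for merging only if they share a co-author. A real system would need a more careful similarity measure and a bound on cycle length, neither of which is specified here.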
Appears in Collections
COLLEGE OF COMPUTING > ERICA Division of Computer Science > 1. Journal Articles

Related Researcher

Kim, Jung sun
ERICA College of Software Convergence (ERICA Division of Computer Science)