LAME: Layout-Aware Metadata Extraction Approach for Research Articles

Full metadata record
DC Field | Value | Language
dc.contributor.author | Choi, Jongyun | -
dc.contributor.author | Kong, Hyesoo | -
dc.contributor.author | Yoon, Hwamook | -
dc.contributor.author | Oh, Heungseon | -
dc.contributor.author | Jung, Yuchul | -
dc.date.accessioned | 2022-05-17T02:05:46Z | -
dc.date.available | 2022-05-17T02:05:46Z | -
dc.date.created | 2022-05-17 | -
dc.date.issued | 2022-03 | -
dc.identifier.issn | 1546-2218 | -
dc.identifier.uri | https://scholarworks.bwise.kr/kumoh/handle/2020.sw.kumoh/21096 | -
dc.description.abstract | The volume of academic literature, such as academic conference papers and journals, has increased rapidly worldwide, and research on metadata extraction is ongoing. However, high-performing metadata extraction is still challenging because layout formats differ across journal publishers. To accommodate the diversity of academic journal layouts, we propose a novel LAyout-aware Metadata Extraction (LAME) framework with three components: the design of an automatic layout analysis, the construction of a large metadata training set, and the implementation of a metadata extractor. In the framework, we designed an automatic layout analysis using PDFMiner. Based on the layout analysis, a large volume of metadata-separated training data, including the title, abstract, author name, author-affiliated organization, and keywords, was automatically extracted. Moreover, we constructed a pre-trained model, Layout-MetaBERT, to extract the metadata from academic journals with varying layout formats. In experiments, our metadata extractor exhibited robust performance (Macro-F1, 93.27%) in metadata extraction for unseen journals with different layout formats. | -
dc.language | English | -
dc.language.iso | en | -
dc.publisher | TECH SCIENCE PRESS | -
dc.title | LAME: Layout-Aware Metadata Extraction Approach for Research Articles | -
dc.type | Article | -
dc.contributor.affiliatedAuthor | Choi, Jongyun | -
dc.contributor.affiliatedAuthor | Jung, Yuchul | -
dc.identifier.doi | 10.32604/cmc.2022.025711 | -
dc.identifier.wosid | 000779567700001 | -
dc.identifier.bibliographicCitation | CMC-COMPUTERS MATERIALS & CONTINUA, v.72, no.2, pp.4019 - 4037 | -
dc.relation.isPartOf | CMC-COMPUTERS MATERIALS & CONTINUA | -
dc.citation.title | CMC-COMPUTERS MATERIALS & CONTINUA | -
dc.citation.volume | 72 | -
dc.citation.number | 2 | -
dc.citation.startPage | 4019 | -
dc.citation.endPage | 4037 | -
dc.type.rims | ART | -
dc.type.docType | Article | -
dc.description.journalClass | 1 | -
dc.description.isOpenAccess | Y | -
dc.description.journalRegisteredClass | scie | -
dc.description.journalRegisteredClass | scopus | -
dc.relation.journalResearchArea | Computer Science | -
dc.relation.journalResearchArea | Materials Science | -
dc.relation.journalWebOfScienceCategory | Computer Science, Information Systems | -
dc.relation.journalWebOfScienceCategory | Materials Science, Multidisciplinary | -
dc.subject.keywordAuthor | Automatic layout analysis | -
dc.subject.keywordAuthor | layout-MetaBERT | -
dc.subject.keywordAuthor | metadata extraction | -
dc.subject.keywordAuthor | research article | -
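
The abstract in this record reports performance as Macro-F1 over the extracted metadata fields. As a rough illustration of what that metric measures, the sketch below computes a macro-averaged F1 score over the five field types named in the abstract. The label strings, the `macro_f1` helper, and the toy gold and predicted sequences are all assumptions made for this example; they are not taken from the paper or its data.

```python
from collections import Counter

# Field types named in the abstract; the exact label strings are assumed here.
LABELS = ["title", "abstract", "author", "affiliation", "keywords"]


def macro_f1(gold, pred, labels=LABELS):
    """Unweighted mean of the per-class F1 scores (macro averaging)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[g] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); a class with no support scores 0 here.
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(labels)


# Toy data invented for illustration only.
gold = ["title", "author", "author", "affiliation", "abstract", "keywords"]
pred = ["title", "author", "affiliation", "affiliation", "abstract", "keywords"]
print(f"Macro-F1: {macro_f1(gold, pred):.4f}")
```

Because macro averaging weights every class equally, a rare field such as keywords counts as much as a frequent one such as author, which is one reason the metric suits comparisons across journals with different layouts.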
Related Researcher

JUNG, YU CHUL
College of Engineering (Department of Computer Engineering)