Layout Aware Semantic Element Extraction for Sustainable Science & Technology Decision Support
Open access

Authors
Kim, Hyuntae; Choi, Jongyun; Park, Soyoung; Jung, Yuchul
Issue Date
Mar-2022
Publisher
MDPI
Keywords
multi-modal; document layout analysis; metadata; document structure; document object; semantic elements; knowledge graph; transformer; decision support
Citation
SUSTAINABILITY, v.14, no.5
Journal Title
SUSTAINABILITY
Volume
14
Number
5
URI
https://scholarworks.bwise.kr/kumoh/handle/2020.sw.kumoh/21026
DOI
10.3390/su14052802
ISSN
2071-1050
Abstract
New scientific and technological (S&T) knowledge is being introduced rapidly, so the effort required to understand and analyze newly published S&T documents grows daily. Automated text mining and vision recognition techniques alleviate the burden somewhat, but the variety of document layout formats and knowledge content granularities across the S&T field makes this challenging. Therefore, this paper proposes LA-SEE (LAME and Vi-SEE), a knowledge graph construction framework that simultaneously extracts meta-information and useful image objects from S&T documents in various layout formats. We adopt Layout-aware Metadata Extraction (LAME), which can accurately extract metadata from various layout formats, and implement a transformer-based instance segmentation method, Vision-based Semantic Elements Extraction (Vi-SEE), to maximize vision-based semantic element recognition. Moreover, to construct a scientific knowledge graph spanning multiple S&T documents, we define a new, extensible Semantic Elements Knowledge Graph (SEKG) structure. So far, we have extracted about 6 million semantic elements from 49,649 PDFs. In addition, to illustrate the potential of our SEKG, we provide two promising application scenarios: a scientific knowledge guide across multiple S&T documents and question answering over scientific tables.
Appears in Collections
ETC > 1. Journal Articles

Related Researcher

JUNG, YU CHUL
College of Engineering (Department of Computer Engineering)
