Layout Aware Semantic Element Extraction for Sustainable Science & Technology Decision Support
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Kim, Hyuntae | - |
dc.contributor.author | Choi, Jongyun | - |
dc.contributor.author | Park, Soyoung | - |
dc.contributor.author | Jung, Yuchul | - |
dc.date.accessioned | 2022-05-02T08:40:04Z | - |
dc.date.available | 2022-05-02T08:40:04Z | - |
dc.date.created | 2022-04-25 | - |
dc.date.issued | 2022-03 | - |
dc.identifier.issn | 2071-1050 | - |
dc.identifier.uri | https://scholarworks.bwise.kr/kumoh/handle/2020.sw.kumoh/21026 | - |
dc.description.abstract | New scientific and technological (S&T) knowledge is being introduced rapidly, and the effort required to understand and analyze newly published S&T documents is increasing daily. Automated text mining and vision recognition techniques alleviate the burden somewhat, but the variety of document layout formats and knowledge content granularities across the S&T field makes this challenging. Therefore, this paper proposes LA-SEE (LAME and Vi-SEE), a knowledge graph construction framework that simultaneously extracts meta-information and useful image objects from S&T documents in various layout formats. We adopt Layout-aware Metadata Extraction (LAME), which can accurately extract metadata from various layout formats, and implement transformer-based instance segmentation, Vision-based Semantic Elements Extraction (Vi-SEE), to maximize vision-based semantic element recognition. Moreover, to construct a scientific knowledge graph spanning multiple S&T documents, we define an extensible Semantic Elements Knowledge Graph (SEKG) structure. So far, we have extracted about 6 million semantic elements from 49,649 PDFs. In addition, to illustrate the potential of SEKG, we present two promising application scenarios: a scientific knowledge guide across multiple S&T documents and question answering over scientific tables. | - |
dc.language | English | - |
dc.language.iso | en | - |
dc.publisher | MDPI | - |
dc.title | Layout Aware Semantic Element Extraction for Sustainable Science & Technology Decision Support | - |
dc.type | Article | - |
dc.contributor.affiliatedAuthor | Kim, Hyuntae | - |
dc.contributor.affiliatedAuthor | Choi, Jongyun | - |
dc.contributor.affiliatedAuthor | Park, Soyoung | - |
dc.contributor.affiliatedAuthor | Jung, Yuchul | - |
dc.identifier.doi | 10.3390/su14052802 | - |
dc.identifier.wosid | 000771366700001 | - |
dc.identifier.bibliographicCitation | SUSTAINABILITY, v.14, no.5 | - |
dc.relation.isPartOf | SUSTAINABILITY | - |
dc.citation.title | SUSTAINABILITY | - |
dc.citation.volume | 14 | - |
dc.citation.number | 5 | - |
dc.type.rims | ART | - |
dc.type.docType | Article | - |
dc.description.journalClass | 1 | - |
dc.description.isOpenAccess | Y | - |
dc.description.journalRegisteredClass | scie | - |
dc.description.journalRegisteredClass | ssci | - |
dc.description.journalRegisteredClass | scopus | - |
dc.relation.journalResearchArea | Science & Technology - Other Topics | - |
dc.relation.journalResearchArea | Environmental Sciences & Ecology | - |
dc.relation.journalWebOfScienceCategory | Green & Sustainable Science & Technology | - |
dc.relation.journalWebOfScienceCategory | Environmental Sciences | - |
dc.relation.journalWebOfScienceCategory | Environmental Studies | - |
dc.subject.keywordAuthor | multi-modal | - |
dc.subject.keywordAuthor | document layout analysis | - |
dc.subject.keywordAuthor | metadata | - |
dc.subject.keywordAuthor | document structure | - |
dc.subject.keywordAuthor | document object | - |
dc.subject.keywordAuthor | semantic elements | - |
dc.subject.keywordAuthor | knowledge graph | - |
dc.subject.keywordAuthor | transformer | - |
dc.subject.keywordAuthor | decision support | - |