한국어 중첩 개체명 분석을 위한 연구
The analysis of the nested structure of named entities in Korea
- Authors
- 송영숙; 정유남; 유현조
- Issue Date
- 2022
- Publisher
- 한국어의미학회
- Keywords
- 개체명; 개체명 인식; 중첩 개체명; 복합 개체명; 정보 추출; 개체명 주석; 개체명 경계 탐지; 최장 개체명; 최단 개체명; 자연어 처리; named entity; named entity recognition; nested named entity; complex named entity; information extraction; named entity annotation; named entity boundary detection; longest named entity; shortest named entity; natural language processing
- Citation
- 한국어 의미학, v.76, pp. 66-101
- Pages
- 36
- Journal Title
- 한국어 의미학
- Volume
- 76
- Start Page
- 66
- End Page
- 101
- URI
- https://scholarworks.bwise.kr/cau/handle/2019.sw.cau/61840
- ISSN
- 1226-7198
- 2734-0171
- Abstract
- This paper analyzes the hierarchical structure of named entities in the NIKL Named Entity Corpus, which is annotated with 553,830 flat named entity tags. The study is intended as groundwork for developing a method to build a Korean nested named entity corpus. Flat named entity recognition identifies mentions as linear spans. In contrast, the nested approach analyzes the hierarchical internal structure of named entities, which may consist of smaller component named entities. We extracted candidate mentions for nested named entity analysis from the NIKL Named Entity Corpus and classified them into three categories: serial named entities, complex named entities, and phrases with a named entity head. These candidates were then reviewed manually to select the targets of nested named entity analysis. Finally, we discuss the span and internal structure of named entities and propose principles and guidelines for constructing a Korean nested named entity corpus.
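The contrast between flat and nested annotation described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Span` class, the `build_nested` helper, the `ORG`/`LOC` labels, and the English example phrase are hypothetical and are not taken from the paper or from the NIKL tag set. The sketch also assumes spans are properly nested (no crossing boundaries).

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    start: int   # inclusive character offset
    end: int     # exclusive character offset
    label: str   # illustrative label only, not the NIKL tag set
    children: list["Span"] = field(default_factory=list)


def contains(outer: Span, inner: Span) -> bool:
    """True if inner lies inside outer and is strictly shorter."""
    inside = outer.start <= inner.start and inner.end <= outer.end
    return inside and (outer.end - outer.start) > (inner.end - inner.start)


def build_nested(spans: list[Span]) -> list[Span]:
    """Arrange spans into a containment forest; roots are the longest spans."""
    # Sort by start offset, longest first, so a parent precedes its children.
    ordered = sorted(spans, key=lambda s: (s.start, -(s.end - s.start)))
    roots: list[Span] = []
    stack: list[Span] = []
    for span in ordered:
        # Drop candidates that cannot contain the current span.
        while stack and not contains(stack[-1], span):
            stack.pop()
        (stack[-1].children if stack else roots).append(span)
        stack.append(span)
    return roots


def render(text: str, spans: list[Span], depth: int = 0) -> list[str]:
    """Return one indented line per span, outermost first."""
    lines = []
    for s in spans:
        lines.append("  " * depth + f"[{s.label}] {text[s.start:s.end]}")
        lines.extend(render(text, s.children, depth + 1))
    return lines


# Hypothetical example phrase with three candidate mentions.
text = "Seoul National University Hospital"
candidates = [Span(0, 5, "LOC"), Span(0, 34, "ORG"), Span(0, 25, "ORG")]
tree = build_nested(candidates)
print("\n".join(render(text, tree)))
```

In this representation a flat annotation corresponds to keeping only one level of the tree: the roots for a longest-entity reading, or the leaves for a shortest-entity reading. A nested annotation keeps the whole tree.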