Biomedical Flat and Nested Named Entity Recognition: Methods, Challenges, and Advances (open access)
- Authors
- Park, Yesol; Son, Gyujin; Rho, Mina
- Issue Date
- Oct-2024
- Publisher
- MDPI
- Keywords
- named entity recognition; biomedical named entity recognition; flat named entity recognition; nested named entity recognition; flat and nested named entity recognition; natural language processing
- Citation
- Applied Sciences-basel, v.14, no.20, pp 1 - 23
- Pages
- 23
- Indexed
- SCIE; SCOPUS
- Journal Title
- Applied Sciences-basel
- Volume
- 14
- Number
- 20
- Start Page
- 1
- End Page
- 23
- URI
- https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/197987
- DOI
- 10.3390/app14209302
- ISSN
- 2076-3417
- Abstract
- Biomedical named entity recognition (BioNER) aims to identify biomedical entities (e.g., diseases, chemicals, and genes) in text and classify them into predefined classes. This process serves as an important initial step in extracting biomedical information from textual sources. Based on the structure of the entities they address, BioNER tasks are divided into two categories: flat NER, where entities do not overlap, and nested NER, where entities can be embedded within other entities. While early studies primarily addressed flat NER, recent advances in neural models have enabled more sophisticated approaches to nested NER, which is gaining relevance in the biomedical field, where entity relationships are often complex and hierarchically structured. This review therefore focuses on the latest progress in approaches based on large-scale pre-trained language models, which have significantly improved NER performance. State-of-the-art flat NER models have achieved average F1-scores of 84% on BC2GM, 89% on NCBI Disease, and 92% on BC4CHEM, while nested NER models have reached 80% on the GENIA dataset, indicating room for improvement. In addition, we discuss persistent challenges, including inconsistent annotation of named entities across corpora and the limited availability of annotated named entities for many entity types, particularly for multi-type or nested NER. To the best of our knowledge, this paper is the first comprehensive review of pre-trained language model-based flat and nested BioNER models, providing a categorical analysis of the methods and the related challenges for future research and development in the field.
- Appears in Collections
- College of Engineering (Seoul) > Department of Computer Software (Seoul) > 1. Journal Articles
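
The flat versus nested distinction drawn in the abstract, and the F1-scores it reports, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the `Span` type, the helper names, the example sentence and its labels, and the choice of exact-match entity-level F1 (a prediction counts only if start, end, and label all match).

```python
from typing import List, NamedTuple


class Span(NamedTuple):
    """A hypothetical entity annotation over a token sequence."""
    start: int  # index of the first token (inclusive)
    end: int    # index one past the last token (exclusive)
    label: str


def is_nested(a: Span, b: Span) -> bool:
    """True if one span lies entirely inside the other and the boundaries differ."""
    if (a.start, a.end) == (b.start, b.end):
        return False
    a_in_b = b.start <= a.start and a.end <= b.end
    b_in_a = a.start <= b.start and b.end <= a.end
    return a_in_b or b_in_a


def has_nesting(spans: List[Span]) -> bool:
    """True if any pair of spans in the annotation is nested."""
    return any(
        is_nested(a, b)
        for i, a in enumerate(spans)
        for b in spans[i + 1:]
    )


def entity_f1(gold: List[Span], pred: List[Span]) -> float:
    """Exact-match entity-level F1 (assumed metric; corpora may score differently)."""
    gold_set, pred_set = set(gold), set(pred)
    true_positives = len(gold_set & pred_set)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred_set)
    recall = true_positives / len(gold_set)
    return 2 * precision * recall / (precision + recall)


# Illustrative sentence, not drawn from any corpus.
tokens = ["The", "IL-2", "receptor", "gene", "is", "expressed"]

# Nested annotation: a protein mention sits inside a longer DNA mention.
nested_gold = [Span(1, 3, "protein"), Span(1, 4, "DNA")]

# Flat annotation: only the outer, non-overlapping mention is kept.
flat_gold = [Span(1, 4, "DNA")]

# A flat model can return at most one of the two nested gold entities here.
flat_prediction = [Span(1, 4, "DNA")]

print(has_nesting(nested_gold))
print(has_nesting(flat_gold))
print(round(entity_f1(nested_gold, flat_prediction), 3))
```

In this sketch a model restricted to non-overlapping output recovers only the outer mention, so its recall against the nested annotation is capped even when every prediction it makes is correct.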
