Biomedical Flat and Nested Named Entity Recognition: Methods, Challenges, and Advances (open access)

Authors
Park, Yesol; Son, Gyujin; Rho, Mina
Issue Date
Oct-2024
Publisher
MDPI
Keywords
named entity recognition; biomedical named entity recognition; flat named entity recognition; nested named entity recognition; flat and nested named entity recognition; natural language processing
Citation
Applied Sciences-basel, v.14, no.20, pp 1 - 23
Pages
23
Indexed
SCIE
SCOPUS
Journal Title
Applied Sciences-basel
Volume
14
Number
20
Start Page
1
End Page
23
URI
https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/197987
DOI
10.3390/app14209302
ISSN
2076-3417
Abstract
Biomedical named entity recognition (BioNER) aims to identify biomedical entities (e.g., diseases, chemicals, and genes) in text and classify them into predefined classes. This process serves as an important initial step in extracting biomedical information from textual sources. Based on the structure of the entities they address, BioNER tasks are divided into two categories: flat NER, where entities do not overlap, and nested NER, which identifies entities embedded within other entities. While early studies primarily addressed flat NER, recent advances in neural models have enabled more sophisticated approaches to nested NER, which is gaining relevance in the biomedical field, where entity relationships are often complex and hierarchically structured. This review therefore focuses on the latest progress in approaches based on large-scale pre-trained language models, which have significantly improved NER performance. State-of-the-art flat NER models have achieved average F1-scores of 84% on BC2GM, 89% on NCBI Disease, and 92% on BC4CHEM, while nested NER models have reached 80% on the GENIA dataset, indicating room for improvement. In addition, we discuss persistent challenges, including inconsistent annotation of named entities across corpora and the limited availability of annotated entities of various types, particularly for multi-type or nested NER. To the best of our knowledge, this paper is the first comprehensive review of pre-trained language model-based flat and nested BioNER models, providing a categorical analysis of the methods and related challenges for future research and development in the field.
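The flat versus nested distinction drawn in the abstract can be illustrated with a minimal sketch. This is not code from the paper: the span representation, the function names `is_nested` and `has_nesting`, and the example tokens and labels are all assumptions made for illustration only.

```python
from typing import List, Tuple

# Hypothetical span format: (start_token, end_token_exclusive, label).
Span = Tuple[int, int, str]


def is_nested(inner: Span, outer: Span) -> bool:
    """Return True if `inner` lies entirely within `outer` and covers a different range."""
    same_range = inner[0] == outer[0] and inner[1] == outer[1]
    return outer[0] <= inner[0] and inner[1] <= outer[1] and not same_range


def has_nesting(spans: List[Span]) -> bool:
    """Return True if any span in the annotation is embedded in another span."""
    return any(is_nested(a, b) for a in spans for b in spans)


# Illustrative tokens and labels (assumed, not taken from any corpus).
tokens = ["IL-2", "gene", "expression", "in", "T", "cells"]

# Flat annotation: entities do not overlap.
flat_annotation: List[Span] = [(0, 2, "DNA"), (4, 6, "cell_type")]

# Nested annotation: "IL-2" (protein) is embedded within "IL-2 gene" (DNA).
nested_annotation: List[Span] = [(0, 1, "protein"), (0, 2, "DNA"), (4, 6, "cell_type")]

print(has_nesting(flat_annotation))
print(has_nesting(nested_annotation))
```

A flat NER model can emit one label per token, whereas an annotation like `nested_annotation` needs a representation that lets a single token belong to more than one entity.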
Appears in Collections
Seoul College of Engineering > Seoul School of Computer Software > 1. Journal Articles

Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Rho, Mi na
COLLEGE OF ENGINEERING (SCHOOL OF COMPUTER SCIENCE)