AI-based nanotoxicity data extraction and prediction of nanotoxicity

Ha, Eunyong; Ha, Seung Min; Gerelkhuu, Zayakhuu; Kim, Hyun-Yi; Yoon, Tae Hyun

doi:10.1016/j.csbj.2025.03.052


Full metadata record

DC Field	Value	Language
dc.contributor.author	Ha, Eunyong	-
dc.contributor.author	Ha, Seung Min	-
dc.contributor.author	Gerelkhuu, Zayakhuu	-
dc.contributor.author	Kim, Hyun-Yi	-
dc.contributor.author	Yoon, Tae Hyun	-
dc.date.accessioned	2025-12-22T06:30:30Z	-
dc.date.available	2025-12-22T06:30:30Z	-
dc.date.issued	2025-01	-
dc.identifier.issn	2001-0370	-
dc.identifier.uri	https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/209982	-
dc.description.abstract	With the growing use of nanomaterials (NMs), assessing their toxicity has become increasingly important. Among toxicity assessment methods, computational models for predicting nanotoxicity are emerging as alternatives to traditional in vitro and in vivo assays, which involve high costs and ethical concerns. As a result, the importance of both data quality and data quantity is now widely recognized. However, collecting large, high-quality datasets is both time-consuming and labor-intensive. Artificial intelligence (AI)-based data extraction techniques hold significant potential for extracting and organizing information from unstructured text, but the use of large language models (LLMs) and prompt engineering for nanotoxicity data extraction has not been widely studied. In this study, we developed an AI-based automated data extraction pipeline to facilitate efficient data collection. The automation process was implemented using Python-based LangChain. We used 216 nanotoxicity research articles as training data to refine prompts and evaluate LLM performance. Subsequently, the most suitable LLM with refined prompts was used to extract test data from 605 research articles. Data extraction performance on the training data achieved F1_D.E. (F1 score for Data Extraction) ranging from 84.6% to 87.6% across different LLMs. Furthermore, using the dataset extracted from the test set, we constructed automated machine learning (AutoML) models that achieved F1_N.P. (F1 score for Nanotoxicity Prediction) exceeding 86.1% in predicting nanotoxicity. Additionally, we assessed the reliability and applicability of the models by comparing them in terms of ground truth, size, and balance. This study highlights the potential of AI-based data extraction, representing a significant contribution to nanotoxicity research.	-
dc.format.extent	11	-
dc.language	English	-
dc.language.iso	ENG	-
dc.publisher	Research Network of Computational and Structural Biotechnology	-
dc.title	AI-based nanotoxicity data extraction and prediction of nanotoxicity	-
dc.type	Article	-
dc.publisher.location	Netherlands	-
dc.identifier.doi	10.1016/j.csbj.2025.03.052	-
dc.identifier.scopusid	2-s2.0-105002252903	-
dc.identifier.wosid	001483139600001	-
dc.identifier.bibliographicCitation	Computational and Structural Biotechnology Journal, v.29, pp. 138-148	-
dc.citation.title	Computational and Structural Biotechnology Journal	-
dc.citation.volume	29	-
dc.citation.startPage	138	-
dc.citation.endPage	148	-
dc.type.docType	Article	-
dc.description.isOpenAccess	Y	-
dc.description.journalRegisteredClass	scie	-
dc.description.journalRegisteredClass	scopus	-
dc.relation.journalResearchArea	Biochemistry & Molecular Biology	-
dc.relation.journalResearchArea	Biotechnology & Applied Microbiology	-
dc.relation.journalWebOfScienceCategory	Biochemistry & Molecular Biology	-
dc.relation.journalWebOfScienceCategory	Biotechnology & Applied Microbiology	-
dc.subject.keywordPlus	NANOPARTICLES	-
dc.subject.keywordPlus	MODEL	-
dc.subject.keywordPlus	TOXICITY	-
dc.subject.keywordPlus	CLASSIFICATION	-
dc.subject.keywordPlus	NANOMATERIALS	-
dc.subject.keywordPlus	CURATION	-
dc.subject.keywordAuthor	Nanotoxicity	-
dc.subject.keywordAuthor	Large Language Models	-
dc.subject.keywordAuthor	Data extraction	-
dc.subject.keywordAuthor	Prompt engineering	-
dc.subject.keywordAuthor	LangChain	-
dc.subject.keywordAuthor	Automated machine learning	-
dc.identifier.url	https://www.sciencedirect.com/science/article/pii/S2001037025001175?via%3Dihub	-
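
The abstract reports F1 scores for data extraction without stating, in this record, how extracted fields were matched against ground truth. As a minimal sketch only, assuming exact-match comparison of field values for a single article, an F1 score could be computed as below. The field names, the example values, and the helper names `count_matches` and `f1_score` are hypothetical and are not taken from the paper.

```python
def count_matches(extracted: dict, ground_truth: dict) -> tuple:
    """Count true positives, false positives, and false negatives.

    Assumption: a field counts as correct only if its extracted value
    exactly equals the ground-truth value.
    """
    tp = sum(1 for k, v in extracted.items() if ground_truth.get(k) == v)
    fp = len(extracted) - tp      # extracted values that are wrong or spurious
    fn = len(ground_truth) - tp   # ground-truth values that were missed
    return tp, fp, fn


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return F1, the harmonic mean of precision and recall."""
    denominator = 2 * tp + fp + fn
    if denominator == 0:
        return 0.0
    return 2 * tp / denominator


# Hypothetical example: one article with four annotated fields.
ground_truth = {"material": "ZnO", "size_nm": "20", "cell_line": "A549", "assay": "MTT"}
extracted = {"material": "ZnO", "size_nm": "25", "cell_line": "A549"}

tp, fp, fn = count_matches(extracted, ground_truth)
f1 = f1_score(tp, fp, fn)
print(f"TP={tp} FP={fp} FN={fn} F1={f1:.3f}")
```

In this sketch a wrong value counts both as a false positive and as a false negative, which is one common convention; other matching rules (for example, tolerance on numeric values) would give different scores.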


Appears in Collections: Seoul College of Natural Sciences > Seoul Department of Chemistry > 1. Journal Articles


Related Researcher


Yoon, Tae Hyun: COLLEGE OF NATURAL SCIENCES (DEPARTMENT OF CHEMISTRY)


222, Wangsimni-ro, Seongdong-gu, Seoul, 04763, Korea. Tel: +82-2-2220-1366

Certain data included herein are derived from the © Web of Science of Clarivate Analytics. All rights reserved.
You may not copy or re-distribute this material in whole or in part without the prior written consent of Clarivate Analytics.
