Discovering microbe-disease associations from the literature using a hierarchical long short-term memory network and an ensemble parser model (open access)
- Authors
- Park, Yesol; Lee, Joohong; Moon, Heesang; Choi, Yong Suk; Rho, Mina
- Issue Date
- Feb-2021
- Publisher
- NATURE RESEARCH
- Citation
- SCIENTIFIC REPORTS, v.11, no.1, pp.1 - 12
- Indexed
- SCIE, SCOPUS
- Journal Title
- SCIENTIFIC REPORTS
- Volume
- 11
- Number
- 1
- Start Page
- 1
- End Page
- 12
- URI
- https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/142311
- DOI
- 10.1038/s41598-021-83966-8
- ISSN
- 2045-2322
- Abstract
- With recent advances in biotechnology and sequencing technology, the microbial community has been intensively studied and found to be associated with many chronic as well as acute diseases. Although a tremendous number of studies describing the association between microbes and diseases have been published, text mining methods that focus on such associations have rarely been studied. We propose a framework that combines machine learning and natural language processing methods to analyze the association between microbes and diseases. A hierarchical long short-term memory network was used to detect sentences that describe the association. For the sentences detected, two different parse tree-based search methods were combined to find the relation-describing word. The ensemble model of constituency parsing for structural pattern matching and dependency-based relation extraction improved the prediction accuracy. By combining deep learning and parse tree-based extraction, our proposed framework could extract microbe-disease associations with higher accuracy. The evaluation results showed that our system achieved F-scores of 0.8764 and 0.8524 in making binary decisions and extracting relation words, respectively. As a case study, we performed a large-scale analysis of the association between microbes and diseases. Additionally, a set of common microbes shared by multiple diseases was identified in this study. This study could provide valuable information on the major microbes that have been studied for a specific disease. The code and data are available at https://github.com/DMnBI/mdi_predictor.
- Appears in Collections
- Seoul College of Engineering > Seoul School of Computer Software > 1. Journal Articles
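The abstract reports F-scores of 0.8764 and 0.8524 without spelling out how the metric is computed. As a reminder, the F-score is the harmonic mean of precision and recall. The snippet below is a minimal sketch, assuming the balanced F1 variant (the abstract does not state which variant was used); the function name `f_score` and the counts in the example are hypothetical and are not taken from the paper.

```python
def f_score(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """Return the F-beta score from true-positive, false-positive,
    and false-negative counts. beta=1.0 gives the balanced F1 score
    (an assumption; the abstract does not name the variant)."""
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts, for illustration only (not the paper's data).
print(round(f_score(tp=80, fp=10, fn=20), 4))  # prints 0.8421
```

With these made-up counts, precision is 80/90 and recall is 80/100, so F1 equals 160/190, or about 0.8421.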