Splicing signature database development to delineate cancer pathways using literature mining and transcriptome machine learning
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Lee, Kyubin | - |
dc.contributor.author | Hyung, Daejin | - |
dc.contributor.author | Cho, Soo Young | - |
dc.contributor.author | Yu, Namhee | - |
dc.contributor.author | Hong, Sewha | - |
dc.contributor.author | Kim, Jihyun | - |
dc.contributor.author | Kim, Sunshin | - |
dc.contributor.author | Han, Ji-Youn | - |
dc.contributor.author | Park, Charny | - |
dc.date.accessioned | 2023-05-03T09:33:46Z | - |
dc.date.available | 2023-05-03T09:33:46Z | - |
dc.date.issued | 2023-03 | - |
dc.identifier.issn | 2001-0370 | - |
dc.identifier.uri | https://scholarworks.bwise.kr/erica/handle/2021.sw.erica/112553 | - |
dc.description.abstract | Alternative splicing (AS) events modulate certain pathways and phenotypic plasticity in cancer. Although previous studies have computationally analyzed splicing events, it is still a challenge to uncover biological functions induced by reliable AS events from tremendous candidates. To provide essential splicing event signatures to assess pathway regulation, we developed a database by collecting two datasets: (i) reported literature and (ii) cancer transcriptome profile. The former includes knowledge-based splicing signatures collected from 63,229 PubMed abstracts using natural language processing, extracted for 202 pathways. The latter is the machine learning-based splicing signatures identified from pan-cancer transcriptome for 16 cancer types and 42 pathways. We established six different learning models to classify pathway activities from splicing profiles as a learning dataset. Top-ranked AS events by learning model feature importance became the signature for each pathway. To validate our learning results, we performed evaluations by (i) performance metrics, (ii) differential AS sets acquired from external datasets, and (iii) our knowledge-based signatures. The area under the receiver operating characteristic values of the learning models did not exhibit any drastic difference. However, random-forest distinctly presented the best performance to compare with the AS sets identified from external datasets and our knowledge-based signatures. Therefore, we used the signatures obtained from the random-forest model. Our database provided the clinical characteristics of the AS signatures, including survival test, molecular subtype, and tumor microenvironment. The regulation by splicing factors was additionally investigated. Our database for developed signatures supported retrieval and visualization system. © 2023 The Authors. Published by Elsevier B.V. on behalf of Research Network of Computational and Structural Biotechnology. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). | - |
dc.format.extent | 11 | - |
dc.language | English | - |
dc.language.iso | ENG | - |
dc.publisher | Research Network of Computational and Structural Biotechnology | - |
dc.title | Splicing signature database development to delineate cancer pathways using literature mining and transcriptome machine learning | - |
dc.type | Article | - |
dc.publisher.location | Netherlands | - |
dc.identifier.doi | 10.1016/j.csbj.2023.02.052 | - |
dc.identifier.scopusid | 2-s2.0-85149814197 | - |
dc.identifier.wosid | 000955034200001 | - |
dc.identifier.bibliographicCitation | Computational and Structural Biotechnology Journal, v.21, pp 1978 - 1988 | - |
dc.citation.title | Computational and Structural Biotechnology Journal | - |
dc.citation.volume | 21 | - |
dc.citation.startPage | 1978 | - |
dc.citation.endPage | 1988 | - |
dc.type.docType | Article | - |
dc.description.isOpenAccess | Y | - |
dc.description.journalRegisteredClass | scie | - |
dc.description.journalRegisteredClass | scopus | - |
dc.relation.journalResearchArea | Biochemistry & Molecular Biology | - |
dc.relation.journalResearchArea | Biotechnology & Applied Microbiology | - |
dc.relation.journalWebOfScienceCategory | Biochemistry & Molecular Biology | - |
dc.relation.journalWebOfScienceCategory | Biotechnology & Applied Microbiology | - |
dc.subject.keywordPlus | ESRP1 | - |
dc.subject.keywordPlus | GENES | - |
dc.subject.keywordPlus | ATLAS | - |
dc.subject.keywordAuthor | Machine-learning | - |
dc.subject.keywordAuthor | Alternative splicing | - |
dc.subject.keywordAuthor | Tumor transcriptome | - |
dc.subject.keywordAuthor | Database | - |
dc.subject.keywordAuthor | Gene signature | - |
dc.identifier.url | https://www.sciencedirect.com/science/article/pii/S2001037023000983?via%3Dihub | - |
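
The abstract describes ranking AS events by model feature importance and taking the top-ranked events as each pathway's signature. The following is a minimal, standard-library-only sketch of that ranking idea, not the authors' pipeline: it swaps the random-forest feature importance for a simpler univariate score, the distance of each event's AUROC from 0.5. The function names (`auroc`, `top_events`), the event identifiers, and the toy values are all hypothetical; the per-sample splicing values are assumed to be numeric (for example percent-spliced-in), which the abstract does not specify.

```python
from typing import Dict, List, Sequence, Tuple


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUROC computed as the Mann-Whitney U statistic; ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def top_events(
    profiles: Dict[str, Sequence[float]],
    labels: Sequence[int],
    k: int,
) -> List[Tuple[str, float]]:
    """Rank events by |AUROC - 0.5| (a stand-in for feature importance) and keep the top k."""
    ranked = sorted(
        ((event, abs(auroc(values, labels) - 0.5)) for event, values in profiles.items()),
        key=lambda item: item[1],
        reverse=True,
    )
    return ranked[:k]


# Toy data: 1 = pathway active, 0 = pathway inactive (hypothetical labels).
labels = [1, 1, 1, 0, 0, 0]
profiles = {
    "event_A": [0.90, 0.80, 0.85, 0.20, 0.10, 0.30],  # higher in active samples
    "event_B": [0.50, 0.40, 0.60, 0.50, 0.40, 0.60],  # uninformative
    "event_C": [0.10, 0.20, 0.75, 0.70, 0.80, 0.90],  # mostly lower in active samples
}

signature = top_events(profiles, labels, k=2)
print(signature)
```

A univariate score like this ignores interactions between events, which a tree ensemble can capture; it is shown only because it makes the "rank, then keep the top events" step easy to follow.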