Enhancing Genomic Data Representation through BERT-LSTM Hybrid Architecture
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Kim, Kyeong Ho | - |
| dc.contributor.author | Kim, Minji | - |
| dc.contributor.author | Kim, Sohui | - |
| dc.contributor.author | Lee, Jong-Min | - |
| dc.date.accessioned | 2026-05-27T00:30:38Z | - |
| dc.date.available | 2026-05-27T00:30:38Z | - |
| dc.date.issued | 2025-04 | - |
| dc.identifier.issn | 2169-3536 | - |
| dc.identifier.uri | https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/212861 | - |
| dc.description.abstract | This study proposes a novel approach for effective genetic sequence representation, focusing on the challenges of compressing and analyzing complex genomic data. We introduce a hybrid architecture that combines Bidirectional Encoder Representations from Transformers (BERT) with Long Short-Term Memory (LSTM) networks to generate comprehensive and compact gene embeddings. Our method processes genetic sequence data through k-mer tokenization and employs BERT to capture complex patterns, followed by LSTM to preserve essential sequential information while creating fixed-size representations. Using data from 623 participants from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database, we analyzed genetic sequences across 10 genes to evaluate our approach. The effectiveness of our method is demonstrated through both visualization and quantitative evaluation. The t-distributed stochastic neighbor embedding (t-SNE) visualization revealed improved clustering of gene embeddings compared to traditional approaches, while our model achieved 82% accuracy in gene classification tasks. Our findings indicate that the combination of BERT and LSTM effectively captures both local and global genetic patterns while creating meaningful compressed representations, providing a promising framework for genetic sequence analysis. | - |
| dc.format.extent | 11 | - |
| dc.language | English | - |
| dc.language.iso | ENG | - |
| dc.publisher | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC | - |
| dc.title | Enhancing Genomic Data Representation through BERT-LSTM Hybrid Architecture | - |
| dc.type | Article | - |
| dc.publisher.location | United States | - |
| dc.identifier.doi | 10.1109/ACCESS.2025.3560282 | - |
| dc.identifier.scopusid | 2-s2.0-105003188956 | - |
| dc.identifier.wosid | 001483881100024 | - |
| dc.identifier.bibliographicCitation | IEEE ACCESS, v.13, pp 76497 - 76507 | - |
| dc.citation.title | IEEE ACCESS | - |
| dc.citation.volume | 13 | - |
| dc.citation.startPage | 76497 | - |
| dc.citation.endPage | 76507 | - |
| dc.type.docType | Article | - |
| dc.description.isOpenAccess | N | - |
| dc.description.journalRegisteredClass | scie | - |
| dc.description.journalRegisteredClass | scopus | - |
| dc.relation.journalResearchArea | Computer Science | - |
| dc.relation.journalResearchArea | Engineering | - |
| dc.relation.journalResearchArea | Telecommunications | - |
| dc.relation.journalWebOfScienceCategory | Computer Science, Information Systems | - |
| dc.relation.journalWebOfScienceCategory | Engineering, Electrical & Electronic | - |
| dc.relation.journalWebOfScienceCategory | Telecommunications | - |
| dc.subject.keywordPlus | GENETICS | - |
| dc.subject.keywordAuthor | Genetics | - |
| dc.subject.keywordAuthor | Genomics | - |
| dc.subject.keywordAuthor | Bioinformatics | - |
| dc.subject.keywordAuthor | Transformers | - |
| dc.subject.keywordAuthor | Long short term memory | - |
| dc.subject.keywordAuthor | Tokenization | - |
| dc.subject.keywordAuthor | Encoding | - |
| dc.subject.keywordAuthor | Bidirectional control | - |
| dc.subject.keywordAuthor | Data models | - |
| dc.subject.keywordAuthor | Sequences | - |
| dc.subject.keywordAuthor | BERT | - |
| dc.subject.keywordAuthor | gene embedding | - |
| dc.subject.keywordAuthor | LSTM | - |
| dc.subject.keywordAuthor | representation learning | - |
| dc.subject.keywordAuthor | SNP | - |
| dc.subject.keywordAuthor | tokenization | - |
| dc.identifier.url | https://ieeexplore.ieee.org/document/10964250 | - |
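
The abstract states that sequences are processed through k-mer tokenization before being passed to BERT. As a minimal sketch of that first step only, the function below splits a nucleotide string into overlapping k-mers. The function name `kmer_tokenize`, the default `k=3`, and the `stride` parameter are illustrative assumptions; this record does not state which values or implementation the authors used.

```python
def kmer_tokenize(sequence: str, k: int = 3, stride: int = 1) -> list[str]:
    """Split a nucleotide sequence into k-mers of length k.

    Hypothetical helper: k and stride are assumed values, not taken
    from the paper. With stride=1 consecutive k-mers overlap by k-1 bases.
    """
    if k <= 0 or stride <= 0:
        raise ValueError("k and stride must be positive")
    seq = sequence.upper()
    # Slide a window of width k across the sequence; sequences shorter
    # than k yield an empty list.
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


tokens = kmer_tokenize("ATGCGT", k=3)
print(tokens)
```

Each k-mer would then be mapped to a vocabulary index before being fed to an encoder; that mapping and the downstream BERT and LSTM stages are not shown here.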
