Structure-guided sequence representation learning for generalizable protein function prediction
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | On, Seokjun | - |
| dc.contributor.author | Jeong, Yujin | - |
| dc.contributor.author | Kim, Eun-Sol | - |
| dc.date.accessioned | 2025-11-13T02:00:14Z | - |
| dc.date.available | 2025-11-13T02:00:14Z | - |
| dc.date.issued | 2025-09 | - |
| dc.identifier.issn | 1367-4803 | - |
| dc.identifier.issn | 1367-4811 | - |
| dc.identifier.uri | https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/209117 | - |
| dc.description.abstract | Motivation: Accurately predicting protein function from sequence remains a fundamental yet challenging goal in computational biology. Although recent advances have enabled the reliable prediction of protein 3D structures from sequences, utilizing structural information alone for functional inference has shown limited success. To address this gap, previous work has explored the integration of sequence and structural data by representing proteins as graphs, where residues are modeled as nodes, and spatial proximity defines edges. However, since the number of amino acids can vary significantly between proteins, the resulting graphs, constructed based on amino acids, also differ greatly in size. This large variation poses a challenge, as it becomes extremely difficult to extract generalizable information from graphs of such differing scales accurately. In this work, we propose Structure-guided Sequence Representation Learning, a novel framework that incorporates structural knowledge to extract informative, multiscale features directly from protein sequences. By embedding structural information into a sequence-based learning paradigm, our method captures functionally meaningful representations more effectively. Furthermore, we present a generalizable model architecture designed for multitask learning and inference, offering improved performance and flexibility over traditional task-specific approaches to protein function prediction. Results: In this article, we demonstrate that the proposed novel attention pooling method on protein graphs effectively integrates global structural features and local chemical properties of amino acids in various-length proteins. Through this approach, we improve performance in tasks related to predicting protein functions, functional expression sites, and their relationships with structure and sequence. By effectively extracting the information needed to predict multiple protein functions simultaneously, we improve efficiency by eliminating the need for separate learning. Availability and implementation: The code implementation is available at https://github.com/vanha9/S2RL_protein and has also been archived on Zenodo: https://doi.org/10.5281/zenodo.16441001. | - |
| dc.format.extent | 9 | - |
| dc.language | English | - |
| dc.language.iso | ENG | - |
| dc.publisher | Oxford University Press | - |
| dc.title | Structure-guided sequence representation learning for generalizable protein function prediction | - |
| dc.type | Article | - |
| dc.publisher.location | United Kingdom | - |
| dc.identifier.doi | 10.1093/bioinformatics/btaf511 | - |
| dc.identifier.scopusid | 2-s2.0-105017769205 | - |
| dc.identifier.wosid | 001583238100001 | - |
| dc.identifier.bibliographicCitation | Bioinformatics, v.41, no.9, pp 1 - 9 | - |
| dc.citation.title | Bioinformatics | - |
| dc.citation.volume | 41 | - |
| dc.citation.number | 9 | - |
| dc.citation.startPage | 1 | - |
| dc.citation.endPage | 9 | - |
| dc.type.docType | Article | - |
| dc.description.isOpenAccess | Y | - |
| dc.description.journalRegisteredClass | scie | - |
| dc.description.journalRegisteredClass | scopus | - |
| dc.relation.journalResearchArea | Biochemistry & Molecular Biology | - |
| dc.relation.journalResearchArea | Biotechnology & Applied Microbiology | - |
| dc.relation.journalResearchArea | Computer Science | - |
| dc.relation.journalResearchArea | Mathematical & Computational Biology | - |
| dc.relation.journalResearchArea | Mathematics | - |
| dc.relation.journalWebOfScienceCategory | Biochemical Research Methods | - |
| dc.relation.journalWebOfScienceCategory | Biotechnology & Applied Microbiology | - |
| dc.relation.journalWebOfScienceCategory | Computer Science, Interdisciplinary Applications | - |
| dc.relation.journalWebOfScienceCategory | Mathematical & Computational Biology | - |
| dc.relation.journalWebOfScienceCategory | Statistics & Probability | - |
| dc.identifier.url | https://academic.oup.com/bioinformatics/article/41/9/btaf511/8253735 | - |
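
The abstract describes two ideas that a small example may make concrete: building a residue graph from spatial proximity, and pooling per-residue features with attention so that proteins of different lengths map to a fixed-size vector. The following is only a minimal illustrative sketch, not the authors' implementation (which is available at the repository linked in the abstract). The function names `contact_graph` and `attention_pool`, the 8.0 distance cutoff, and the fixed query vector are all assumptions introduced here for illustration.

```python
import math

def contact_graph(coords, cutoff=8.0):
    """Return undirected edges (i, j) for residue pairs closer than `cutoff`.

    `coords` is a list of (x, y, z) tuples, one per residue. The cutoff
    value is an illustrative assumption, not a value taken from the paper.
    """
    edges = []
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) < cutoff:
                edges.append((i, j))
    return edges

def attention_pool(features, query):
    """Pool per-residue feature vectors into one vector of fixed size.

    Each residue is scored by its dot product with `query` (a hypothetical
    stand-in for a learned parameter); scores are normalized with a softmax
    and used as weights in a weighted sum of the feature vectors.
    """
    scores = [sum(f * q for f, q in zip(feat, query)) for feat in features]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    weights = [w / total for w in weights]
    dim = len(query)
    return [sum(w * feat[d] for w, feat in zip(weights, features))
            for d in range(dim)]

# Two toy "proteins" of different lengths yield pooled vectors of equal size.
short_protein = [[1.0, 2.0], [3.0, 4.0]]
long_protein = [[0.5, 0.1], [0.2, 0.9], [0.7, 0.3], [0.4, 0.4]]
query = [0.0, 0.0]  # a zero query gives uniform weights, i.e. mean pooling
pooled_short = attention_pool(short_protein, query)
pooled_long = attention_pool(long_protein, query)
toy_edges = contact_graph([(0, 0, 0), (3, 0, 0), (20, 0, 0)])
```

Because the pooled output has the dimensionality of the feature vectors rather than of the sequence, the same downstream predictor can in principle be applied to proteins of any length, which is the motivation the abstract gives for pooling over graphs of differing sizes.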
