Structure-guided sequence representation learning for generalizable protein function prediction

Full metadata record
DC Field: Value
dc.contributor.author: On, Seokjun
dc.contributor.author: Jeong, Yujin
dc.contributor.author: Kim, Eun-Sol
dc.date.accessioned: 2025-11-13T02:00:14Z
dc.date.available: 2025-11-13T02:00:14Z
dc.date.issued: 2025-09
dc.identifier.issn: 1367-4803
dc.identifier.issn: 1367-4811
dc.identifier.uri: https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/209117
dc.description.abstract: Motivation: Accurately predicting protein function from sequence remains a fundamental yet challenging goal in computational biology. Although recent advances have enabled the reliable prediction of protein 3D structures from sequences, utilizing structural information alone for functional inference has shown limited success. To address this gap, previous work has explored the integration of sequence and structural data by representing proteins as graphs, where residues are modeled as nodes, and spatial proximity defines edges. However, since the number of amino acids can vary significantly between proteins, the resulting graphs, constructed based on amino acids, also differ greatly in size. This large variation poses a challenge, as it becomes extremely difficult to extract generalizable information from graphs of such differing scales accurately. In this work, we propose Structure-guided Sequence Representation Learning, a novel framework that incorporates structural knowledge to extract informative, multiscale features directly from protein sequences. By embedding structural information into a sequence-based learning paradigm, our method captures functionally meaningful representations more effectively. Furthermore, we present a generalizable model architecture designed for multitask learning and inference, offering improved performance and flexibility over traditional task-specific approaches to protein function prediction. Results: In this article, we demonstrate that the proposed novel attention pooling method on protein graphs effectively integrates global structural features and local chemical properties of amino acids in various-length proteins. Through this approach, we improve performance in tasks related to predicting protein functions, functional expression sites, and their relationships with structure and sequence. By effectively extracting the information needed to predict multiple protein functions simultaneously, we improve efficiency by eliminating the need for separate learning. Availability and implementation: The code implementation is available at https://github.com/vanha9/S2RL_protein and has also been archived on Zenodo: https://doi.org/10.5281/zenodo.16441001.
dc.format.extent: 9
dc.language: English
dc.language.iso: ENG
dc.publisher: Oxford University Press
dc.title: Structure-guided sequence representation learning for generalizable protein function prediction
dc.type: Article
dc.publisher.location: United Kingdom
dc.identifier.doi: 10.1093/bioinformatics/btaf511
dc.identifier.scopusid: 2-s2.0-105017769205
dc.identifier.wosid: 001583238100001
dc.identifier.bibliographicCitation: Bioinformatics, v.41, no.9, pp 1 - 9
dc.citation.title: Bioinformatics
dc.citation.volume: 41
dc.citation.number: 9
dc.citation.startPage: 1
dc.citation.endPage: 9
dc.type.docType: Article
dc.description.isOpenAccess: Y
dc.description.journalRegisteredClass: scie
dc.description.journalRegisteredClass: scopus
dc.relation.journalResearchArea: Biochemistry & Molecular Biology
dc.relation.journalResearchArea: Biotechnology & Applied Microbiology
dc.relation.journalResearchArea: Computer Science
dc.relation.journalResearchArea: Mathematical & Computational Biology
dc.relation.journalResearchArea: Mathematics
dc.relation.journalWebOfScienceCategory: Biochemical Research Methods
dc.relation.journalWebOfScienceCategory: Biotechnology & Applied Microbiology
dc.relation.journalWebOfScienceCategory: Computer Science, Interdisciplinary Applications
dc.relation.journalWebOfScienceCategory: Mathematical & Computational Biology
dc.relation.journalWebOfScienceCategory: Statistics & Probability
dc.identifier.url: https://academic.oup.com/bioinformatics/article/41/9/btaf511/8253735
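
Illustrative sketch (not part of the record): the abstract describes proteins as graphs whose nodes are residues and whose edges are defined by spatial proximity, and an attention pooling step that handles proteins of varying length. The Python below is a minimal sketch of those two ideas only. The function names, the 8 angstrom cutoff, and the dot-product scoring are assumptions chosen for illustration; they are not taken from the paper or from the authors' repository, which remains the reference for the actual method.

```python
import math

def contact_edges(coords, cutoff=8.0):
    """Undirected edges (i, j) between residues closer than `cutoff`.

    `coords` is a list of (x, y, z) positions, one per residue.
    The 8.0 default is an assumed threshold, not the paper's value.
    """
    edges = []
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) < cutoff:
                edges.append((i, j))
    return edges

def attention_pool(features, query):
    """Softmax-weighted average of per-residue feature vectors.

    The output has the feature dimension, whatever the number of
    residues, which is what makes variable-length inputs comparable.
    Scoring by dot product with a `query` vector is an assumption.
    """
    scores = [sum(f * q for f, q in zip(feat, query)) for feat in features]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(query)
    return [
        sum((w / total) * feat[d] for w, feat in zip(weights, features))
        for d in range(dim)
    ]

# Four residues on a line; only the first three are mutually within 8 units.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (30.0, 0.0, 0.0)]
edges = contact_edges(coords)  # [(0, 1), (0, 2), (1, 2)]

# Proteins of length 2 and 5 pool to vectors of the same size.
short = [[1.0, 0.0], [3.0, 2.0]]
long_ = [[0.5, 1.0], [1.5, 0.0], [2.0, 2.0], [0.0, 1.0], [1.0, 1.0]]
pooled_short = attention_pool(short, [0.0, 0.0])
pooled_long = attention_pool(long_, [1.0, -1.0])
```

With a zero query every residue receives equal weight, so the pooled vector is the plain mean of the residue features; a learned query would instead emphasize some residues over others.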
Appears in Collections: Seoul College of Engineering > Seoul School of Computer Science > 1. Journal Articles
