Cooperative sequence clustering and decoding for DNA storage system with fountain codes
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Jeong, Jaeho | - |
dc.contributor.author | Park, Seong-Joon | - |
dc.contributor.author | Kim, Jae-Won | - |
dc.contributor.author | No, Jong-Seon | - |
dc.contributor.author | Jeon, Ha Hyeon | - |
dc.contributor.author | Lee, Jeong Wook | - |
dc.contributor.author | No, Albert | - |
dc.contributor.author | Kim, Sunghwan | - |
dc.contributor.author | Park, Hosung | - |
dc.date.accessioned | 2022-01-20T05:41:21Z | - |
dc.date.available | 2022-01-20T05:41:21Z | - |
dc.date.created | 2022-01-20 | - |
dc.date.issued | 2021-09-15 | - |
dc.identifier.issn | 1367-4803 | - |
dc.identifier.uri | https://scholarworks.bwise.kr/hongik/handle/2020.sw.hongik/24363 | - |
dc.description.abstract | Motivation: In DNA storage systems, there are tradeoffs between writing and reading costs. Increasing the code rate of error-correcting codes may save writing cost, but it requires more sequence reads for data retrieval. There is potentially a way to improve the sequencing and decoding processes such that the reading cost induced by this tradeoff is reduced without increasing the writing cost. In previous research, clustering, alignment and decoding were treated as separate stages, but we believe that using the information from all these processes together may improve decoding performance. Actual experiments of DNA synthesis and sequencing should be performed because simulations cannot be relied on to cover all error possibilities in practical circumstances. Results: For DNA storage systems using fountain code and Reed-Solomon (RS) code, we introduce several techniques to improve the decoding performance. We designed the decoding process focusing on the cooperation of key components: Hamming-distance based clustering, discarding of abnormal sequence reads, RS error correction as well as detection, and quality score-based ordering of sequences. We synthesized 513.6 KB of data into DNA oligo pools and sequenced it successfully with an Illumina MiSeq instrument. Compared to Erlich's research, the proposed decoding method additionally incorporates sequence reads with minor errors which had been discarded before, and thus was able to make use of 10.6-11.9% more sequence reads from the same sequencing environment. This resulted in a 6.5-8.9% reduction in the reading cost. Channel characteristics including sequence coverage and read-length distributions are provided as well. | - |
dc.language | English | - |
dc.language.iso | en | - |
dc.publisher | OXFORD UNIV PRESS | - |
dc.subject | DIGITAL INFORMATION | - |
dc.subject | ROBUST | - |
dc.title | Cooperative sequence clustering and decoding for DNA storage system with fountain codes | - |
dc.type | Article | - |
dc.contributor.affiliatedAuthor | No, Albert | - |
dc.identifier.doi | 10.1093/bioinformatics/btab246 | - |
dc.identifier.scopusid | 2-s2.0-85133450710 | - |
dc.identifier.wosid | 000733827400007 | - |
dc.identifier.bibliographicCitation | BIOINFORMATICS, v.37, no.19, pp.3136 - 3143 | - |
dc.relation.isPartOf | BIOINFORMATICS | - |
dc.citation.title | BIOINFORMATICS | - |
dc.citation.volume | 37 | - |
dc.citation.number | 19 | - |
dc.citation.startPage | 3136 | - |
dc.citation.endPage | 3143 | - |
dc.type.rims | ART | - |
dc.type.docType | Article | - |
dc.description.journalClass | 1 | - |
dc.description.journalRegisteredClass | scie | - |
dc.description.journalRegisteredClass | scopus | - |
dc.relation.journalResearchArea | Biochemistry & Molecular Biology | - |
dc.relation.journalResearchArea | Biotechnology & Applied Microbiology | - |
dc.relation.journalResearchArea | Computer Science | - |
dc.relation.journalResearchArea | Mathematical & Computational Biology | - |
dc.relation.journalResearchArea | Mathematics | - |
dc.relation.journalWebOfScienceCategory | Biochemical Research Methods | - |
dc.relation.journalWebOfScienceCategory | Biotechnology & Applied Microbiology | - |
dc.relation.journalWebOfScienceCategory | Computer Science, Interdisciplinary Applications | - |
dc.relation.journalWebOfScienceCategory | Mathematical & Computational Biology | - |
dc.relation.journalWebOfScienceCategory | Statistics & Probability | - |
dc.subject.keywordPlus | DIGITAL INFORMATION | - |
dc.subject.keywordPlus | ROBUST | - |
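
The abstract names Hamming-distance based clustering and the discarding of abnormal sequence reads as two of the cooperating components. The following is a minimal illustrative sketch of that general idea, not the authors' implementation: the greedy first-fit strategy, the length-based discard rule, the majority-vote consensus, and the names `cluster_reads`, `consensus`, `expected_len` and `max_dist` are all assumptions made for this example. In a full pipeline, each consensus candidate would presumably then be passed to RS error detection and correction before fountain decoding.

```python
from collections import Counter


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(x != y for x, y in zip(a, b))


def cluster_reads(reads, expected_len, max_dist):
    """Greedy first-fit clustering (assumed strategy, for illustration).

    Each read joins the first cluster whose representative (its first
    member) is within max_dist; otherwise it starts a new cluster.
    """
    clusters = []
    for read in reads:
        if len(read) != expected_len:
            continue  # assumed discard rule: drop reads of abnormal length
        for members in clusters:
            if hamming(read, members[0]) <= max_dist:
                members.append(read)
                break
        else:
            clusters.append([read])
    return clusters


def consensus(members):
    """Per-position majority vote over the reads in one cluster."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*members))


# Toy reads: two underlying sequences, each seen with one substitution error,
# plus one truncated read that the length filter discards.
reads = [
    "ACGTACGT",
    "ACGTACGA",
    "ACGTACGT",
    "TTTTGGGG",
    "TTTTGGGC",
    "TTTTGGGG",
    "ACGTAC",
]
clusters = cluster_reads(reads, expected_len=8, max_dist=2)
candidates = [consensus(c) for c in clusters]
print(candidates)
```

Reads with a small number of substitutions still land in the correct cluster here and contribute to the vote, which is the kind of read the abstract says had previously been discarded.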