Comparative Evaluation of Genome Assemblers from Long-Read Sequencing for Plants and Crops

Authors
Jung, Hyungtaek; Jeon, Min-Seung; Hodgett, Matthew; Waterhouse, Peter; Eyun, Seong-il
Issue Date
Jul-2020
Publisher
AMER CHEMICAL SOC
Keywords
plant genome; next-generation sequencing; Pacific Biosciences; long reads; nanopore; assemblers
Citation
JOURNAL OF AGRICULTURAL AND FOOD CHEMISTRY, v.68, no.29, pp 7670 - 7677
Pages
8
Journal Title
JOURNAL OF AGRICULTURAL AND FOOD CHEMISTRY
Volume
68
Number
29
Start Page
7670
End Page
7677
URI
https://scholarworks.bwise.kr/cau/handle/2019.sw.cau/53566
DOI
10.1021/acs.jafc.0c01647
ISSN
0021-8561
1520-5118
Abstract
Recent state-of-the-art long-read sequencing technologies have significantly increased the ease and speed of producing high-quality plant genome assemblies. A wide variety of genome-related software tools are now available, and they are typically benchmarked using microbial or model eukaryotic genomes such as Arabidopsis and rice. However, many plant species have much larger and more complex genomes than these, and the choice of tools, parameters, and/or strategies is not always obvious. We therefore compared the metrics of assemblies generated by various pipelines to discuss how assembly quality can be affected by two different assembly strategies. First, we focused on optimizing read preprocessing and assembler variables using eight different de novo assemblers on five different Pacific Biosciences long-read datasets of diploid and tetraploid species. Second, we examined a single scaffolding tool (quickmerge) employed for the postprocessing step, merging the outputs from multiple assemblies to produce a higher quality consensus assembly. We then benchmarked the assemblies for completeness and accuracy (assembly metrics and BUSCO), computer memory, and CPU times. Two lightweight assemblers, Miniasm/Minimap/Racon and WTDBG, were deemed good for novice users because they have gentler learning curves and require light computational resources. However, two heavyweight tools, CANU and Flye, should be the first choice when the goal is to achieve accurate and complete assemblies. Our results will provide valuable guidance in future plant genome projects and beyond.
Appears in
Collections
College of Natural Sciences > Department of Life Science > 1. Journal Articles

Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.
