Assessing the impact of assemblers on virus detection in a de novo metagenomic analysis pipeline

White, Daniel J.; Wang, Jing; Hall, Richard J.

doi:10.26091/ESRNZ.7973993.v1

journal contribution

posted on 2019-04-10, 04:13 authored by Daniel J. White, Jing Wang, Richard J. Hall

Applying high-throughput sequencing to pathogen discovery is a relatively new field whose objective is to find disease-causing agents when little or no background information on the disease is available. The key steps are generating millions of sequence reads from an infected tissue sample, assembling these reads into longer, contiguous stretches of nucleotide sequence (contigs), and identifying the contigs by matching them against known sequences in databases such as GenBank or Ensembl. This technique, de novo metagenomics, is particularly useful when the pathogen is viral and strong discriminatory power can be achieved. However, we recently found that different assemblers can produce strikingly different results. In this study, we formally test the impact of five popular assemblers (MIRA, VELVET, METAVELVET, SPADES, and OMEGA) on the detection of a novel virus and the assembly of its whole genome, using a data set in which the presence of the virus has been confirmed by empirical laboratory techniques, and we compare overall performance among the assemblers. Our results show that if results from only one assembler are considered, biologically important reads can easily be overlooked. We consider the implications of these results for the field of pathogen discovery.
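The point that a single assembler can miss biologically important sequences can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes the contigs from each assembler have already been matched against a database and reduced to a set of taxon labels. The assembler names, taxon labels, and both helper functions below are hypothetical placeholders, and the data are made up for illustration only.

```python
from collections import defaultdict


def compare_assembler_hits(hits_by_assembler):
    """Map each detected taxon to the sorted list of assemblers whose contigs matched it."""
    support = defaultdict(set)
    for assembler, taxa in hits_by_assembler.items():
        for taxon in taxa:
            support[taxon].add(assembler)
    return {taxon: sorted(names) for taxon, names in support.items()}


def missed_by(hits_by_assembler):
    """For each assembler, list taxa found by at least one other assembler but not by it."""
    all_taxa = set().union(*hits_by_assembler.values())
    return {
        assembler: sorted(all_taxa - set(taxa))
        for assembler, taxa in hits_by_assembler.items()
    }


# Hypothetical, made-up hit sets (not results from the study).
example_hits = {
    "assembler_A": {"virus_X", "phage_Y"},
    "assembler_B": {"phage_Y"},
    "assembler_C": {"virus_X", "phage_Y", "virus_Z"},
}

if __name__ == "__main__":
    for taxon, names in sorted(compare_assembler_hits(example_hits).items()):
        print(f"{taxon}: detected via {', '.join(names)}")
    for assembler, missing in sorted(missed_by(example_hits).items()):
        print(f"{assembler} missed: {', '.join(missing) or 'nothing'}")
```

In this toy input, relying on the hypothetical `assembler_B` alone would overlook two of the three taxa, which is the kind of discrepancy that comparing assemblers side by side is meant to expose.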

Funding

New Zealand eScience Infrastructure (NeSI)

Ministry of Business, Innovation and Employment (MBIE) - Crown Research Institute Capability Fund

Keywords

Algorithms; Assemblers; De novo metagenomics; Pathogen Discovery; Test; New Zealand; Infectious Diseases; Molecular Biology

Licence

CC BY-NC-SA 4.0
