Wuhan, the capital city of Hubei province in central China, experienced a pneumonia outbreak near the end of 2019. In the following months, several high-profile papers reported the etiological agent of this outbreak: a novel, zoonotic coronavirus now called severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) [1-3]. The rapid identification and genomic characterization of this pandemic virus were made possible by next-generation sequencing (NGS). With the rapid and continual spread of SARS-CoV-2, NGS is poised to strengthen diagnostic and surveillance efforts. This article discusses NGS diagnostics in the context of the coronavirus disease 2019 (COVID-19) pandemic response.
What Is Next-generation Sequencing?
NGS, also called massively parallel or deep sequencing, refers to a group of DNA sequencing technologies that enable simultaneous sequencing of up to billions of DNA fragments. The most popular NGS sequencers were developed by Illumina and include the MiSeq, HiSeq, and NovaSeq instruments. A typical Illumina workflow involves generating DNA fragments and attaching adapter sequences that permit the fragmented DNA to attach to a flow cell (i.e., library preparation); local amplification of flow cell-attached DNA fragments; and sequencing by synthesis using fluorescent nucleotides. The resulting optical readout produces overlapping DNA reads, which can be mapped onto existing genomes or assembled into new genomes.
Other notable NGS sequencers include BGI’s BGISEQ platform and the nanopore sequencers offered by Oxford Nanopore Technologies. BGI’s platform is similar to the Illumina sequencers, except that it amplifies DNA into DNA nanoballs by rolling circle amplification before the nanoballs are loaded onto the flow cell. Oxford Nanopore sequencing, on the other hand, measures disruptions in electrical current across a membrane studded with protein nanopores through which nucleic acids are electrophoresed. Specific DNA sequences disturb the electric current in distinct ways, allowing DNA sequences to be inferred from these disturbances.
All three of these NGS technologies were deployed to sequence the first SARS-CoV-2 genomes in the wake of the Wuhan outbreak [1-3]. Indeed, viral genomes are straightforward to assemble de novo using sequencing data because of their relatively small size.
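To make the idea of assembling overlapping reads concrete, here is a minimal Python sketch of greedy overlap assembly. It is a toy under stated assumptions, not the method used in the studies cited above: reads are error-free, come from a single strand, and are cut from a made-up 21-nucleotide sequence. The function names `overlap` and `greedy_assemble` are hypothetical, and production assemblers additionally handle sequencing errors, reverse complements, and repeats.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`.

    Returns 0 when the best overlap is shorter than `min_len`.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of sequences with the largest overlap."""
    # Remove exact duplicates and reads fully contained in another read.
    contigs = list(dict.fromkeys(reads))
    contigs = [r for r in contigs
               if not any(r != other and r in other for other in contigs)]

    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs)
                   if k not in (best_i, best_j)] + [merged]


# Toy example: a made-up sequence (NOT real viral data) cut into 8-nt reads.
toy_genome = "ATGGCGTACGTTAGCCTAGGA"
toy_reads = [toy_genome[i:i + 8] for i in (0, 3, 6, 9, 12, 13)]
print(greedy_assemble(toy_reads))
```

The fewer repeated stretches a genome has, the fewer ambiguous overlaps a procedure like this encounters, which is one reason small viral genomes are comparatively easy to reconstruct.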
What Samples Can Be Sequenced by NGS?
The earliest SARS-CoV-2 genomes were sequenced from cultured virus and, more directly, clinical samples (e.g., bronchoalveolar lavage fluid and throat swabs) [1-3]. Virus cultures are generated by infecting susceptible cells (e.g., human airway epithelial cells from an uninfected individual) with supernatants isolated from infected patient samples, after which shed viral supernatants are collected. Virus cultures require less sequencing depth than clinical samples to provide the same coverage because they contain higher levels of virus. Although culturing was useful for the initial discovery of SARS-CoV-2, it is too time-consuming to be a viable strategy for clinical diagnosis now that the etiological agent of COVID-19 is known.
Shotgun Metagenomics Versus Targeted Sequencing
Two general NGS approaches can be used to diagnose SARS-CoV-2 in clinical samples: shotgun metagenomics and targeted sequencing. Whereas shotgun metagenomics sequences the entire genetic content of a sample, targeted sequencing first enriches the most relevant genetic material and sequences only that. Target enrichment can be performed either by hybridization capture or by PCR amplification of viral sequences.
The wholesale sequencing of clinical samples is called shotgun metagenomics. Using this workflow, genetic material from the host and all the microbial species in a clinical sample can be sequenced simultaneously, enabling the identification of co-infections and host immune responses. This method is also hypothesis-free, making it the preferred choice for identifying novel pathogens. However, the shotgun approach requires high sequencing depth because viral genomic sequences make up only a small share of the total nucleic acid content in clinical samples.
The sequencing requirements of clinical samples can be reduced by selecting only relevant genetic sequences for sequencing. One way to accomplish this is by hybridization capture, which uses biotinylated oligonucleotide probes that hybridize to target DNA. Following hybridization, the probe-bound DNA is pulled down with streptavidin-coated magnetic beads, which can be magnetically isolated, effectively increasing the concentration of target DNA.
Pathogen-specific sequences can also be enriched by PCR amplification prior to NGS, an approach called amplicon sequencing. Although generally cheaper than the hybridization capture approach, PCR amplification can introduce mutations in sequenced amplicons. Moreover, this approach generates shorter fragments than hybridization capture.
The differences between shotgun metagenomic and targeted sequencing approaches are illustrated by Illumina’s newly developed NGS workflows for SARS-CoV-2 detection. Whereas the shotgun metagenomic approach requires at least 10 million reads to detect SARS-CoV-2 using the NovaSeq instrument, target enrichment dramatically reduces the necessary sequencing depth to only 500,000 reads.
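The link between viral abundance and required sequencing depth can be sketched with simple arithmetic: mean coverage is roughly the number of viral reads multiplied by read length, divided by genome length. The Python sketch below uses the length of the Wuhan-Hu-1 reference genome (29,903 nucleotides); the 150-nucleotide read length and both viral fractions are illustrative assumptions, not values taken from the Illumina workflows, and the calculation assumes reads are spread evenly across the genome.

```python
import math

SARS_COV_2_GENOME_NT = 29_903  # length of the Wuhan-Hu-1 reference genome


def expected_coverage(total_reads: int, read_length: int,
                      viral_fraction: float,
                      genome_length: int = SARS_COV_2_GENOME_NT) -> float:
    """Mean depth = (viral reads x read length) / genome length."""
    viral_reads = total_reads * viral_fraction
    return viral_reads * read_length / genome_length


def reads_needed(target_coverage: float, read_length: int,
                 viral_fraction: float,
                 genome_length: int = SARS_COV_2_GENOME_NT) -> int:
    """Total reads required to reach a target mean depth."""
    return math.ceil(target_coverage * genome_length
                     / (read_length * viral_fraction))


# Hypothetical scenarios: the viral fractions below are assumptions chosen
# only to show the effect of enrichment, not measured values.
scenarios = [
    ("shotgun metagenomics", 10_000_000, 1e-4),
    ("target enrichment", 500_000, 0.5),
]
for label, reads, fraction in scenarios:
    depth = expected_coverage(reads, read_length=150, viral_fraction=fraction)
    print(f"{label}: {reads:,} reads -> ~{depth:.1f}x mean coverage")
```

Under these assumed fractions, the enriched library reaches far higher coverage with a twentieth of the reads, which is the trade-off the two workflows illustrate.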
Why Use NGS to Diagnose SARS-CoV-2?
A key strength of NGS diagnostics is the ability to test up to thousands of patient samples simultaneously. For example, Illumina’s NovaSeq 6000 sequencer can test up to 3,072 samples for SARS-CoV-2 in just 24 hours. The throughput of current diagnostic approaches pales in comparison.
NGS diagnostics can also be inexpensive and highly sensitive. Researchers at Baylor College of Medicine developed a targeted NGS workflow, called Pathogen-Oriented Low-Cost Assembly & Re-Sequencing (POLAR), that can test up to 192 samples in 24 hours. POLAR costs $30 per patient and has a limit of detection of just 84 genome equivalents per milliliter, which is lower than that of most existing SARS-CoV-2 diagnostics.
In addition to enabling higher diagnostic throughput, NGS can inform clinicians and policymakers about the genetic evolution and spread of SARS-CoV-2. For example, by analyzing 84 SARS-CoV-2 genomes from clinical isolates, researchers at Mount Sinai discovered that multiple strains of the virus contributed to the New York City outbreak. Their analysis revealed strains with distinct origins (some from other parts of the US, others from Europe) as well as community spread, identified by clusters of genetically similar SARS-CoV-2 isolates.
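The notion of "clusters of genetically similar isolates" can be illustrated with a small Python sketch that groups aligned sequences whose pairwise differences fall under a threshold. This is a deliberate simplification and not the phylogenetic analysis used in the Mount Sinai study: the isolate names and 8-nucleotide sequences are invented, the function names `hamming` and `cluster_isolates` are hypothetical, and the `max_snps` threshold is an arbitrary assumption.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Count differing positions between two aligned, equal-length sequences.

    Positions where either sequence has an ambiguous base ('N') are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b) if x != "N" and y != "N")


def cluster_isolates(genomes: dict[str, str], max_snps: int = 1) -> list[list[str]]:
    """Single-linkage clustering: link isolates differing by <= max_snps."""
    parent = {name: name for name in genomes}

    def find(x: str) -> str:
        # Follow parent pointers to the cluster representative.
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in combinations(genomes, 2):
        if hamming(genomes[a], genomes[b]) <= max_snps:
            parent[find(a)] = find(b)

    groups: dict[str, list[str]] = {}
    for name in genomes:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values())


# Invented toy alignment (NOT real SARS-CoV-2 data).
toy_isolates = {
    "isolate_A": "ACGTACGT",
    "isolate_B": "ACGTACGA",
    "isolate_C": "TTGTACCC",
    "isolate_D": "TTGTACCA",
}
print(cluster_isolates(toy_isolates, max_snps=1))
```

In practice, whole genomes of roughly 30,000 nucleotides are aligned first, and relationships are usually inferred with phylogenetic trees rather than a fixed distance cutoff.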
Despite their many advantages, NGS diagnostics require more technical expertise to carry out than existing SARS-CoV-2 diagnostics. The technical demands of NGS testing also mean that clinical samples must be shipped to a laboratory, which risks sample degradation and delays turnaround. This contrasts with point-of-care tests that can be performed immediately after sample collection at primary care facilities (e.g., isothermal nucleic acid tests and lateral flow-based antibody tests). Finally, while targeted sequencing can make NGS cheaper as described above, these tests are still more expensive than PCR tests. In addition to these limitations, NGS diagnostics only recently received regulatory approval for clinical use [7, 8].
Overall, although very powerful, NGS SARS-CoV-2 diagnostics are more likely to supplement, not replace, PCR and antibody-based SARS-CoV-2 diagnostics.
1. Zhu N, et al. A Novel Coronavirus from Patients with Pneumonia in China, 2019. N Engl J Med. 2020;382(8):727-33.
2. Zhou P, et al. A pneumonia outbreak associated with a new coronavirus of probable bat origin. Nature. 2020;579(7798):270-3.
3. Lu R, et al. Genomic characterisation and epidemiology of 2019 novel coronavirus: implications for virus origins and receptor binding. Lancet. 2020;395(10224):565-74.
4. Gu W, et al. Clinical Metagenomic Next-Generation Sequencing for Pathogen Detection. Annu Rev Pathol. 2019;14:319-38.
5. Radford AD, et al. Application of next-generation sequencing technologies in virology. J Gen Virol. 2012;93(Pt 9):1853-68.
6. Kiselev D, et al. Current Trends in Diagnostics of Viral Infections of Unknown Etiology. Viruses. 2020;12(2).
7. Illumina (2020). Comprehensive workflow for detecting coronavirus using Illumina benchtop systems. Accessed June 27, 2020.
8. Illumina (2020). Enrichment workflow for detecting coronavirus using Illumina NGS systems. Accessed June 27, 2020.
9. Glenn St Hilaire B, et al. A rapid, low cost, and highly sensitive SARS-CoV-2 diagnostic based on whole genome sequencing. bioRxiv. 2020. doi: 10.1101/2020.04.25.061499.
10. Gonzalez-Reiche AS, et al. Introductions and early spread of SARS-CoV-2 in the New York City area. Science. 2020.