By Jianqiang Zhang, Ganwu Li, Kyoung-Jin Yoon, Phillip Gauger, Karen Harmon and Rodger Main, Iowa State University College of Veterinary Medicine Department of Veterinary Diagnostic and Production Animal Medicine
Nucleic acid sequencing is the process of determining the precise order of nucleotides (A, T, C and G) within a DNA molecule. The sequence of an RNA molecule can be determined by first making a complementary DNA (cDNA) molecule via reverse transcription. Great progress has been made in sequencing technologies over the past 10 years. This column briefly reviews the history and advancement of DNA sequencing technology and then focuses on the application of Sanger sequencing and next-generation sequencing technologies in swine diagnostic medicine, particularly at the Iowa State University Veterinary Diagnostic Laboratory.
History and advancement of DNA sequencing technology
In the mid-1970s, two first-generation sequencing methods were developed, and one of them (Sanger sequencing) was widely accepted for DNA sequencing. From the 1970s to the early 2000s, Sanger sequencing remained the predominant method for DNA/cDNA sequencing and was commercialized with advances in instrumentation and automation. Even now, this method is still used by many diagnostic and research laboratories.
However, this method has some disadvantages: 1) it is mainly suited to sequencing known pathogens; 2) it has low throughput; and 3) it is labor-intensive, time-consuming and costly when used to determine whole genome sequences of pathogens with large genomes.
Some novel sequencing techniques different from the Sanger method were developed in the 2000s and are referred to as next-generation sequencing (NGS) or second-generation sequencing (SGS). There are several different NGS/SGS technologies and platforms, for example: 1) Roche 454 sequencing (pyrosequencing); 2) Genome Analyzer, HiSeq, MiSeq and NextSeq platforms from Solexa/Illumina Co.; and 3) SOLiD and Ion Torrent platforms from Life Technologies Co. One common feature of NGS platforms is massively parallel sequencing, which can generate millions to billions of sequence reads in a single run. Although each nucleotide sequence read generated by NGS is relatively short, the capability of obtaining so many sequence reads at the same time makes NGS a powerful tool for determining large genome sequences at lower cost and faster speed.
In addition, the hypothesis-free nature of NGS provides an unbiased approach for detecting multiple agents in a sample and for discovering novel microorganisms. However, NGS/SGS still has some limitations in speed, read length and the detection of epigenetic markers.
Some sequencing technologies different from second-generation platforms were described as “third-generation” sequencing in 2008-09. Currently, two companies are at the heart of third-generation sequencing technology development: Pacific Biosciences and Oxford Nanopore Technologies.
Third-generation sequencing technologies have obvious advantages, including longer sequence reads, faster sequencing speed, portability (small, portable sequencing machines) and the potential for direct detection of epigenetic markers. However, these technologies are still under active development, in part because their error rates are higher than those of NGS/SGS.
Application of Sanger sequencing and NGS/SGS in swine diagnostic medicine
Sanger sequencing is still widely used in veterinary diagnostics. In recent years, NGS/SGS has also been introduced to veterinary diagnostics. Below we discuss the application of Sanger sequencing and NGS/SGS in swine diagnostic medicine, using the sequencing services offered at the Iowa State University Veterinary Diagnostic Laboratory (ISU VDL) as examples.
Application of Sanger sequencing
If swine veterinarians want to sequence one specific gene of an agent at lower cost and with faster turnaround time for sequence identity comparison and phylogenetic (dendrogram) analysis, Sanger sequencing is the method of choice. Some examples of gene targets and agents are: the ORF5 (GP5) gene for porcine reproductive and respiratory syndrome virus (PRRSV), the HA gene for influenza A virus, the ORF2 (capsid) gene for porcine circovirus 2, the VP7 (outer capsid) gene for porcine rotaviruses (A, B, C), the spike protein gene for porcine epidemic diarrhea virus (PEDV) and porcine deltacoronavirus (PDCoV), VP1 for Seneca Valley virus (SVV), the HN and F genes for porcine parainfluenza virus 1 (PPIV-1), the P146 adhesin gene for Mycoplasma hyopneumoniae, and so on. The sequencing cost is $110-$200 per gene with a turnaround time of two to seven days.
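As a rough illustration of the sequence identity comparison mentioned above, the sketch below computes percent nucleotide identity between two sequences that are assumed to be already aligned to the same length. The function name percent_identity and the short example sequences are hypothetical; they are not real ORF5 data and do not represent the laboratory's analysis pipeline. Skipping alignment columns that contain a gap is one of several possible conventions.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Alignment columns in which either sequence has a gap ('-') are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = 0
    compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Hypothetical 10-nucleotide fragments, for illustration only.
field_strain = "ATGTTGGGGA"
reference = "ATGTTGGAGA"
identity = percent_identity(field_strain, reference)
print(f"{identity:.1f}% identity")
```

In practice, the two sequences would first be aligned with a dedicated alignment tool, and the resulting identities would be interpreted alongside the phylogenetic analysis.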
Application of NGS/SGS
Under various circumstances, NGS/SGS is a better choice than Sanger sequencing.
Whole genome sequencing of viruses
If whole genome sequences of viruses are needed, NGS/SGS can achieve the goal at lower cost and with shorter turnaround time than the Sanger method. ISU VDL has developed NGS procedures for successfully determining whole genome sequences of various RNA viruses, such as PRRSV, PEDV, PDCoV, transmissible gastroenteritis virus, porcine hemagglutinating encephalomyelitis virus, influenza viruses, porcine rotaviruses, SVV, PPIV-1, porcine teschovirus, porcine sapelovirus, porcine kobuvirus, porcine pasivirus, pestivirus, porcine astrovirus, porcine pegivirus, etc., in the form of cell culture isolates and/or clinical samples.
Our random-primer-based NGS, with 24 samples multiplexed in one sequencing reaction on a MiSeq platform, can generally obtain whole genome sequences of RNA viruses from clinical samples with PCR cycle threshold (CT) values below 25; for clinical samples with CT values above 25, the success rate of obtaining whole genome sequences could be lower, but it is still possible to obtain partial genomic sequences. It is noteworthy that, if fewer samples were multiplexed per run, the success rate of obtaining whole genome sequences could be higher, but the cost per sample would increase accordingly.
The current NGS procedures in our laboratory have achieved favorable outcomes for sequencing DNA viruses with relatively small genomes (<10 kb), such as porcine circovirus and porcine parvovirus, but it remains challenging to obtain whole genome sequences of DNA viruses with large genomes from clinical samples. Additional work is needed to improve this.
The cost of NGS for whole genome sequencing of viruses is $300 per sample with a turnaround time of two to four weeks.
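The multiplexing trade-off described above can be sketched with simple arithmetic: when a run's reads and cost are shared evenly among the pooled samples, fewer samples per run means more reads, but also more cost, per sample. The run output and run cost below are hypothetical placeholders chosen for round numbers; they are not ISU VDL figures, and real pools are rarely perfectly even.

```python
from typing import Tuple


def multiplex_tradeoff(total_reads: int, run_cost: float,
                       n_samples: int) -> Tuple[float, float]:
    """Return (reads per sample, cost per sample) for an evenly pooled run."""
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    return total_reads / n_samples, run_cost / n_samples


# Hypothetical run output and run cost, for illustration only.
TOTAL_READS = 24_000_000
RUN_COST = 2400.0

for n in (24, 12, 6):
    reads, cost = multiplex_tradeoff(TOTAL_READS, RUN_COST, n)
    print(f"{n:>2} samples: {reads:,.0f} reads/sample, ${cost:,.2f}/sample")
```

Halving the number of pooled samples doubles both the reads available to each sample and the share of the run cost that each sample carries.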
Identification of multiple agents or more than one virus strain in samples
Unbiased NGS technologies hold the promise of identifying multiple agents in a single sample. For example, co-infection of PEDV with other viruses such as porcine astrovirus, enterovirus G, porcine sapelovirus, porcine kobuvirus, porcine sapovirus, PDCoV, and so on has been identified by NGS in enteric samples. In addition, NGS has the potential to detect and distinguish more than one strain of a virus in a sample. For example, veterinarians sometimes observe clinical signs in PRRSV-vaccinated swine and often ask the laboratory to test whether a field PRRSV strain is present in the sample in addition to vaccine strains. Sanger sequencing is not always capable of answering this question.
In contrast, NGS together with additional sequence analysis software may be capable of revealing the presence of more than one PRRSV strain in a sample, as we have demonstrated in a sample containing two type 2 PRRSV isolates (i.e. Ingelvac PRRS MLV and VR-2385). However, additional work is needed to determine the capability of NGS to detect and distinguish various commercial PRRSV-2 vaccine strains from each predominant lineage of PRRSV-2 field strains when they are concurrently present in a single sample. The cost of NGS for identifying mixed infections is $400 per sample with a turnaround time of two to four weeks.
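One simplified way to picture how read data can hint at more than one strain: at a genome position where two strains differ, the aligned reads split between two bases instead of agreeing on one. The sketch below flags such positions from per-position base calls. It is not the analysis software used at ISU VDL; the function name mixed_positions, the toy data and both thresholds (min_depth, min_minor_freq) are assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, List


def mixed_positions(pileup: Dict[int, str], min_depth: int = 20,
                    min_minor_freq: float = 0.1) -> List[int]:
    """Return positions where a second base is well supported by the reads.

    pileup maps a genome position to the string of bases that the aligned
    reads show at that position (hypothetical input format).
    """
    flagged = []
    for pos in sorted(pileup):
        counts = Counter(b for b in pileup[pos].upper() if b in "ACGT")
        depth = sum(counts.values())
        if depth < min_depth:
            continue  # too few reads to judge this position
        ranked = counts.most_common(2)
        if len(ranked) == 2 and ranked[1][1] / depth >= min_minor_freq:
            flagged.append(pos)
    return flagged


# Toy data: only position 102 has deep coverage and a well-supported second base.
toy_pileup = {
    101: "A" * 50,
    102: "G" * 30 + "A" * 20,
    103: "C" * 49 + "T",
    104: "T" * 6 + "C" * 4,
}
flagged = mixed_positions(toy_pileup)
print(flagged)
```

Sequencing errors also produce low-frequency minor bases, which is why a frequency threshold is applied here; deciding which strains are actually present requires further analysis of how the variable positions are linked.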
Detection of novel or previously unrecognized agents in clinical samples
The metagenomics-based NGS strategy (i.e. hypothesis-free) is able to detect multiple agents, including potential novel pathogens, in a sample without prior knowledge of a specific target. If swine veterinarians have undiagnosed cases and would like to investigate further, NGS can be an option. In recent years, a number of novel swine viruses, e.g. porcine circovirus 3, influenza D virus, PPIV-1, atypical porcine pestivirus associated with congenital tremor, and porcine astrovirus 3 associated with neurological diseases, have been identified in U.S. swine using the NGS approach. The cost of NGS for discovering novel or previously unrecognized pathogens is $400 per sample with a turnaround time of two to four weeks.
Determination of bacterial virulence, resistance and type profiling
NGS-based whole genome sequencing can also be used for bacteria: for detection and identification, for genomic epidemiological studies, and for mining the virulome and antibiotic resistome. Please contact the laboratory for information about this service.
Nucleic acid sequencing technologies have progressed rapidly in recent years and continue to advance. Although the Sanger method is still commonly used for sequencing specific genes, NGS technologies are increasingly used in veterinary diagnostics, offering good turnaround times and cost savings. Specifically, in swine diagnostic medicine, NGS technologies have been used for determining the whole genome sequences of viruses and bacteria, identifying mixed infections, discovering novel or previously unrecognized agents, and characterizing bacterial phenotypes or biotypes.
However, it should be noted that NGS technology is a method of detection that may not confirm the pathogenicity of the detected agent(s); therefore, aligning NGS results with the clinical context of the case and with other diagnostic assays is of the utmost importance.