Journal of Virology, May 2009, p. 4642-4651, Vol. 83, No. 9
0022-538X/09/$08.00+0 doi:10.1128/JVI.02301-08
Copyright © 2009, American Society for Microbiology. All Rights Reserved.

Blood Systems Research Institute, San Francisco, California 94118,1 Department of Laboratory Medicine, University of California, San Francisco, San Francisco, California 94118,2 Stanford Genome Technology Center, Stanford, California,3 National Institute of Health, Department of Virology, Islamabad, Pakistan4
Received 3 November 2008/ Accepted 2 February 2009
Acute flaccid paralysis (AFP), characterized by the rapid onset of asymmetric paralysis, can be caused by a variety of viral infections or coinfections (34). AFP in children under 15 years old is currently monitored in the countries where poliovirus is still endemic (Pakistan, India, Afghanistan, and Nigeria) as part of the Global Polio Laboratory Network (9). Besides wild-type and revertant vaccine strains of polioviruses, several nonpolio enteroviruses, including human enterovirus species A (HEV-A) serotype EV71, have also been associated with AFP and have been linked to up to a third of AFP cases in children (6, 11, 16, 31, 32). In the United States, AFP is also observed in 10 to 41% of the estimated <1% of persons newly infected with West Nile virus who exhibit neurological symptoms (17, 30). Recent studies indicate that Chikungunya virus may also cause AFP in rare cases (33).
In this study, we utilized sequence-independent amplification of partially purified viral nucleic acids from stool samples obtained from South Asian children with nonpolio AFP. All samples had been tested previously by cell culture and found to be negative for poliovirus. Sequence data were obtained both by Sanger sequencing of subcloned DNA and by 454 pyrosequencing.
Sequence-independent amplification of viral nucleic acids. Viral cDNA synthesis from extracted viral RNA/DNA was performed as described previously (38). Briefly, 100 pmol of a primer containing a fixed sequence followed by a randomized octamer at the 3' end was used in a reverse transcription reaction with SuperScript III reverse transcriptase (Invitrogen). A single round of DNA synthesis was then performed using Klenow fragment polymerase (New England Biolabs) with an additional 50 pmol of the same primer. PCR amplification of nucleic acids was then performed using primers consisting of the fixed portions of the random primers (38). Primers used for downstream plasmid subcloning were based on primer K (GAC CAT CTA GCG ACC TCC ACN NNN NNN N) (35) or RA01 (GCC GGA GCT CTG CAG ATA TCN NNN NNN NNN) (19). For the 10 samples submitted for 454 pyrosequencing, the following primers were used in the PCR and corresponding primers containing an additional eight N residues at the 3' end were used for priming during reverse transcription: 454-A, ATC GTC GTC GTA GGC TGC TC; 454-B, GTA TCG CTG GAC ACT GGA CC; 454-C, CGC ATT GGT CGG CAC TTG GT; 454-D, CGT AGA TAA GCG GTC GGC TC; 454-E, CAT CAC ATA GGC GTC CGC TG; 454-F, CGC AGG ACC TCT GAT ACA GG; 454-G, CGC ACT CGA CTC GTA ACA GG; 454-H, CGT CCA GGC ACA ATC CAG TC; 454-I, CCG AGG TTC AAG CGA GGT TG; and 454-J, ACG GTG TGT TAC CGA CGT CC. Either amplification products were subcloned into bacterial plasmids and Sanger sequenced, or PCR products were sequenced directly using a GS-FLX 454 pyrosequencing system (Roche). Briefly, PCR fragments were treated as sonicated bacterial DNA for 454 pyrosequencing, with polishing of the extremities of fragments and ligation of adaptors prior to emulsion PCR as recommended by the system manufacturer (Roche).
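The ten fixed primer sequences listed above can double as sample identifiers. The following Python sketch is illustrative only and is not the authors' pipeline: it assumes each read begins with an exact copy of one 20-nt tag followed by the eight bases derived from the random portion of the primer. The function name `assign_and_trim` and the exact-match rule are assumptions made for this example; real 454 reads would likely need tolerance for sequencing errors and for primer copies at the 3' end.

```python
# Fixed 20-nt portions of the 454 primers, copied from the primer list above.
TAGS = {
    "454-A": "ATCGTCGTCGTAGGCTGCTC",
    "454-B": "GTATCGCTGGACACTGGACC",
    "454-C": "CGCATTGGTCGGCACTTGGT",
    "454-D": "CGTAGATAAGCGGTCGGCTC",
    "454-E": "CATCACATAGGCGTCCGCTG",
    "454-F": "CGCAGGACCTCTGATACAGG",
    "454-G": "CGCACTCGACTCGTAACAGG",
    "454-H": "CGTCCAGGCACAATCCAGTC",
    "454-I": "CCGAGGTTCAAGCGAGGTTG",
    "454-J": "ACGGTGTGTTACCGACGTCC",
}

# Length of the randomized stretch that follows the fixed tag.
RANDOM_LEN = 8


def assign_and_trim(read):
    """Return (tag_name, trimmed_read) for a read that starts with a known tag.

    Hypothetical helper: exact prefix matching only. Reads without a
    recognizable tag are returned unchanged with tag_name set to None.
    """
    read = read.upper()
    for name, tag in TAGS.items():
        if read.startswith(tag):
            # Drop the fixed tag plus the eight random-derived bases.
            return name, read[len(tag) + RANDOM_LEN:]
    return None, read
```

For instance, a read made of the 454-A tag, eight arbitrary bases, and an insert would be assigned to the 454-A sample and reduced to the insert alone.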
Data assembly and processing. For 454 sequencing, primer tag signatures on the random PCR primers were used as identifier tags to assign sequences to the corresponding sample. Sequences were then automatically trimmed of the fixed primer sequences plus eight additional nucleotides at the 3' end corresponding to the random N residue portion used in cDNA synthesis and Klenow extension. Trimmed sequences were assembled into contigs by using Sequencher software (Gene Codes), with a criterion of 95% identity or greater over 35 bp for 454 pyrosequenced products. Similarly, for cloned and sequenced products, primer sequences were removed from plasmid inserts and assembled by using Sequencher software (Gene Codes), with a criterion of 85% identity over 50 bp. The assembled contigs and singlet sequences were analyzed using tBLASTx. Sequences with tBLASTx E values lower than 10^-3 were classified as either eukaryotic viral, bacteriophage, eukaryotic, bacterial, or other based on the match with the best E value. Those sequences with tBLASTx, BLASTn, and BLASTx E values greater than 10^-3 for the best hit were deemed unclassifiable.
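The classification rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' software: it assumes the BLAST hits for one sequence have already been parsed into (category, E value) pairs, and the function name `classify` is hypothetical. A best E value exactly equal to the cutoff is treated here as unclassifiable, which is an assumption because the text only specifies "lower than" and "greater than".

```python
# E-value cutoff stated in the text (10^-3).
E_CUTOFF = 1e-3


def classify(hits):
    """Classify one sequence from its BLAST hits.

    hits: list of (category, e_value) tuples, where category is one of
    "eukaryotic viral", "bacteriophage", "eukaryotic", "bacterial", "other".
    Returns the category of the hit with the best (lowest) E value if that
    value is below the cutoff, otherwise "unclassifiable".
    """
    if not hits:
        return "unclassifiable"
    # The best hit is the one with the smallest E value.
    category, e_value = min(hits, key=lambda hit: hit[1])
    if e_value < E_CUTOFF:
        return category
    return "unclassifiable"
```

A sequence with a bacterial hit at 1e-5 and a eukaryotic viral hit at 1e-20 would be counted as eukaryotic viral under this rule, because only the best-scoring match decides the category.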
Sequencing of HEV VP1 region. Nested consensus primers consisting of sequences within the VP3 and 2A genes of HEV-C were designed to amplify the complete VP1 gene. Primers were as follows (5' to 3'): VP1_C_F1, GGBACNCAYRTNATHTGGGA; VP1_C_F2, GCNTGYMMNGAYTTYWSHGT; VP1_C_R1, GGRCACCANVMNCKNACATG; and VP1_C_R2, GYTTDGGYTTCATGTACACTC. Reverse transcriptase PCR conditions were as follows. Ten microliters of extracted viral RNA was incubated with 100 pmol of random hexamer oligonucleotide and 0.5 mM (each) deoxynucleoside triphosphates at 75°C for 5 min. Subsequently, 40 U of RNase inhibitor, 10 mM dithiothreitol, 1x first-strand extension buffer, and 200 U of SuperScript III reverse transcriptase (Invitrogen) were added to the mixture, and the mixture was incubated at 25°C for 5 min and then at 37°C for 1 h. For PCR, 5 µl of the reaction mixture described above was used in a total reaction volume of 50 µl containing 2.5 mM MgCl2, 0.2 mM deoxynucleoside triphosphates, 1x manufacturer's buffer (New England Biolabs), 0.2 µM (each) primers, and 5 U of Taq polymerase (New England Biolabs). For the first round, the PCR cycle included 4 min of denaturation at 95°C; 30 cycles of 95°C for 45 s, 52°C for 1 min, and 72°C for 1 min; and final extension at 72°C for 7 min. One microliter of the first-round PCR product was used as a template in a second round of amplification under identical conditions, with appropriate second-round primers.
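The consensus primers above use IUPAC ambiguity codes, so each primer name stands for a pool of concrete oligonucleotides. The Python sketch below shows one way to count that pool (the primer's degeneracy) and to test whether a given target sequence is covered by a primer. The helper names `degeneracy` and `primer_matches` are hypothetical and are not taken from the study.

```python
import math
import re

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct sequences encoded by a degenerate primer."""
    return math.prod(len(IUPAC[base]) for base in primer.upper())


def primer_matches(primer, target):
    """True if target is one of the sequences the degenerate primer encodes."""
    pattern = "".join("[" + IUPAC[base] + "]" for base in primer.upper())
    return re.fullmatch(pattern, target.upper()) is not None


VP1_C_F1 = "GGBACNCAYRTNATHTGGGA"
VP1_C_R2 = "GYTTDGGYTTCATGTACACTC"
```

Under these definitions `degeneracy(VP1_C_F1)` evaluates to 576 (3 x 4 x 2 x 2 x 4 x 3) and `degeneracy(VP1_C_R2)` to 12, which gives a sense of how much more permissive the outer forward primer is than the inner reverse primer.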
Phylogenetic analysis. Reference HEV sequences were obtained from NCBI and edited for alignment using GeneDoc software. Three representative sequences from each serotype, when available, were used. Sequence alignments were generated using the CLUSTAL_W package with the default settings. Aligned sequences were trimmed to match the genomic regions of the sequences obtained in this study and used to generate phylogenetic trees in MEGA4 using either neighbor joining, maximum likelihood, or maximum parsimony with bootstrap values calculated from 1,000 replicates. Accession numbers of reference sequences used are as follows: HEV-A sequences, AY177911 and AY69748 to AY69761; HEV-B sequences, AF029859, AF083069, AF85363, AF105342, AF114383, AF231763, AF233852, AF241360, AF311939, AF317694, AF524867, AJ493062, AY167105, AY302539 to AY302560, AY556057, and AY556070; HEV-C sequences, AB205396, AF081308, AF081310, AF205396, AF465511 to AF465515, AF499635 to AF499643, AF546702, AY876912, and AY876913; and HEV-D sequences, AY426531, DQ201177, and EF107097.
Patient demographics and sample processing. All samples had previously tested negative for poliovirus by PCR analysis of cytopathogenic cell culture supernatants in L20M and RD cell lines (ATCC). Patient ages averaged 52 months, with a range of 1 month to 14.5 years. The patients included 24 boys and 11 girls. Control samples were collected from healthy contacts of children with AFP and were assigned identification numbers ending with C, e.g., 5006C.
Nucleotide sequence accession numbers. High-quality sequences and contigs have been deposited in GenBank under accession numbers FI578338 through FI591728.
10^-3 were categorized according to their closest BLAST match. Totals of 29 and 51% of sequences could not be classified using Sanger and 454 pyrosequencing, respectively, similar to levels from previous metagenomic studies of stool samples (7, 8, 13, 15, 22, 41). Both Sanger and 454 pyrosequencing yielded 23% eukaryotic viral sequences (Fig. 1A and B).
TABLE 1. Sample summary
FIG. 1. Sequence classification and distribution. (Left panels) Sequences with E values of ≤10^-3 were classified as either eukaryotic, phage, bacterial, or viral based on the best tBLASTx score. Unclassified sequences are those which had E values of >10^-3 by tBLASTx, BLASTn, and BLASTx analyses. Sequences which did not fall into set categories were designated "other" and included fungal and plasmid vector sequences. (Right panels) The subsets of sequences classified as viral were broken down further by family, genus, and species. Classification results for Sanger sequencing clones derived from 35 AFP patients (A), sequences generated by 454 pyrosequencing from a subset of 10 patients (B), and Sanger sequencing clones obtained from six healthy contacts (C) are shown. Others in the viral pie chart of healthy contacts consist of rhinovirus and Aichi virus.
FIG. 2. Viral genome coverage and depth of sequencing. (A) Lines below the graphical depiction of a generic picornavirus genome represent the locations of HEV-B, HEV-C, or human cosavirus (HCoSV) singlets or contigs acquired from different AFP patients by either Sanger sequencing (S) or 454 pyrosequencing (454). 5'-UTR, 5' untranslated region; IGR, intergenic region. (B) Lines below the graphical depiction of a generic dicistrovirus genome represent the locations of dicistrovirus (Dicis) singlets or contigs acquired from samples from two AFP patients by either Sanger sequencing or 454 pyrosequencing. (C and D) The depth of sequencing for each nucleotide position is shown for 454 pyrosequenced HEV-B from patient 5550 (C) and dicistrovirus from patient 6178 (D).
Viral populations in each patient sample. Within each patient sample, the percentages of viral and nonviral sequences varied greatly, consistent with previously published data (13). Eukaryotic viruses were detected in stool samples from 29 (83%) of 35 AFP patients. Based on tBLASTx classification, sequences from seven virus families, plus four novel virus groups, were detected. Of the 29 virus-positive samples, 17 contained at least one HEV, and 5 of these 17 samples appeared to be coinfected with two clearly distinguishable HEV species (Fig. 3 and 4). Targeted amplification from stool samples, using pan-HEV primers (39), indicated that 23 of these 35 tested patients were positive for HEV infection (20), 17 of which were also identified using viral metagenomics. Other known enteric viruses observed included adenovirus, picobirnavirus, rotavirus, Aichi virus, parechovirus, and rhinovirus, as well as assorted plant viruses of the families Partitiviridae and Tobamoviridae (Fig. 3 and 4; Table 1). In addition to previously identified viruses, several potentially novel viruses were observed. These viruses included one with weak identity (<35% amino acid identity) to the insect virus family Dicistroviridae (a member of the order Picornavirales; detected in samples from three AFP patients), viruses belonging to a novel candidate picornavirus genus named Cosavirus (present in eight samples) (20), a novel circovirus-like virus most closely related to porcine circovirus (<55% amino acid identity; present in a single patient sample), a novel human bocavirus (<80% amino acid identity; present in a single patient sample) (18), and a virus displaying weak amino acid identity (<33%) to a fish nodavirus (present in a single patient sample) (Fig. 3 and 4).
FIG. 3. Sequence classification per patient. (Left pie charts) Sequences generated by subcloning and Sanger sequencing alone from samples from individual patients in which eukaryotic viral sequences were detected were categorized as bacterial (B), unclassified (U), phage (P), eukaryotic (E), viral (V), or other (O). (Right pie charts) Characterization of viral sequences by viral family or species. Values in parentheses are numbers of sequences detected. Virus abbreviations: TMV, tobacco mosaic virus; dicistro-like, dicistrovirus-like virus; and PepMoV, pepper mottle virus.
FIG. 4. Comparison of sequences obtained by Sanger sequencing (upper charts) and pyrosequencing (lower charts). Pie charts are labeled as described in the legend to Fig. 3. In samples from patients 6278 and 6341, no recognized viral sequences were detected by either 454 pyrosequencing or Sanger sequencing (data not shown). Abbreviations: AAV, adeno-associated virus; CMV, cucumber mosaic virus; dicistro-like, dicistrovirus-like virus; noda-like, nodavirus-like virus; and PepMoV, pepper mottle virus. Results for the sample from patient 5727, which had a single tobacco green mosaic virus sequence, are not shown.
Potentially new, divergent HEV serotypes. All HEV singlet sequences or contigs were examined individually by BLASTx and BLASTn similarity searches and by phylogenetic analyses (data not shown). Based on these analyses, three samples were recognized to contain the most divergent HEV sequences, thereby identifying these HEV variants as candidates for new HEV serotypes. HEV variants showing the same antibody neutralization profile (i.e., belonging to the same serotype) have previously been shown to carry VP1 proteins and genes with at least 88% amino acid identity and at least 75% nucleotide identity (25). Degenerate PCR primers flanking the VP1 gene were used to amplify VP1 sequences from the three samples with divergent HEV variants. VP1 genes and proteins from samples obtained from subjects 5034, 5044C, and 5048C exhibited 78, 75, and 75% nucleotide identity and 88, 87, and 83% amino acid identity, respectively, to the closest HEV sequences available in GenBank. Sequences from the samples from patient 5034 and healthy contact 5048C exhibited 98% nucleotide identity to each other, were collected in 2007 within 3 months of each other, and were both from the Punjab province of Pakistan. Phylogenetic analyses of amplified VP1 nucleotide sequences show that the sequences from the 5034 and 5048C samples represent deeply rooted CoxA24 serotypes, while the sequence from the 5044C sample may be divergent enough from preexisting genotypes to qualify as a prototype of a new HEV-C genotype (Fig. 5).
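As an illustration of the identity comparison described above, the following Python sketch computes percent identity between two pre-aligned sequences and checks the result against the 75% nucleotide and 88% amino acid figures cited for VP1. It is a simplified stand-in rather than the authors' method: gap columns are skipped, the cutoffs are applied as "greater than or equal to", and the names `percent_identity` and `serotype_criteria` are all assumptions made for this example.

```python
# VP1 identity figures cited in the text; applying them as inclusive
# lower bounds is an assumption of this sketch.
NT_CUTOFF = 75.0
AA_CUTOFF = 88.0


def percent_identity(seq_a, seq_b):
    """Percent identity between two aligned sequences of equal length.

    Columns in which either sequence has a gap ("-") are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def serotype_criteria(nt_identity, aa_identity):
    """Report which of the two VP1 identity criteria a comparison meets."""
    return {
        "nucleotide": nt_identity >= NT_CUTOFF,
        "amino_acid": aa_identity >= AA_CUTOFF,
    }
```

With the values reported above, `serotype_criteria(78, 88)` for the sample from patient 5034 meets both criteria, whereas `serotype_criteria(75, 87)` for the sample from contact 5044C meets the nucleotide criterion only, in line with its description as a possible prototype of a new genotype.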
FIG. 5. Unrooted neighbor-joining phylogenetic analysis showing relationships based on the alignment of divergent HEV VP1 gene nucleotide sequences from samples obtained from subjects 5044C, 5034, and 5048C. Filled symbols represent VP1 sequences amplified from samples from AFP patients or healthy contacts of AFP patients. Open symbols represent the corresponding closest BLASTn matches in GenBank. Collapsed branches represent at least three representative sequences from HEV species (HEV-A, HEV-B, and HEV-D) or serotypes (e.g., CoxA21).
23% of the total, regardless of the sequencing method. Despite the problems associated with shorter sequence reads, pyrosequencing was superior in both viral detection and genome coverage for the 10 patients tested, at approximately the same financial cost. As pyrosequencing technology improves to generate longer sequence reads, it is likely to supplant Sanger shotgun sequencing as a method of viral identification and discovery. Prior viral metagenomic studies of feces utilized Sanger shotgun sequencing of 532 (8), 4,600 (13), and 36,769 (41) plasmid subclones at the cost of analyzing fewer samples (1, 12, and 3, respectively). Two of these studies detected primarily plant viruses (41) or bacteriophage (8) (in this case, likely the result of focusing on dsDNA viruses), while the third study, using diarrhea samples, detected known viral pathogens as well as sequences divergent enough to potentially belong to two new viral species (astrovirus and nodavirus) (13, 14). In our study, the most common plant virus detected was pepper mild mottle virus, which has also been reported to occur at high frequencies in North American and Singaporean human stool samples (41). In addition to bacteriophage and plant viruses, we detected known pathogenic enteric viruses, including rotavirus, adenovirus, picobirnavirus, and numerous members of the Picornaviridae family, including parechovirus, Aichi virus, rhinovirus, cardioviruses, and HEV-A to HEV-C, as well as several new viral species. The high proportion of healthy children with viruses in their stool samples (six of six) (Table 1) underlines the often asymptomatic nature of many enteric viral infections whose clinical outcomes are likely dictated by a combination of viral and host genetics, active and passive immunity (i.e., maternal antibodies), and overall health (26).
Specific nested panenterovirus PCR primers detected HEV infection in 23 of the 35 AFP cases (20), while 17 of 35 AFP samples exhibited at least one HEV sequence in the viral metagenomic analysis. Both metagenomic analysis and pan-HEV PCR detected HEV infection in all six healthy contacts. This correlation was less pronounced for members of the new candidate picornavirus genus Cosavirus than for HEV, as cosavirus sequences were found in only 9 of 41 samples from AFP patients and healthy contacts by shotgun sequencing, compared to 19 of these 41 by nested PCR (20). Similarly, human cardiovirus SAFV was found in 3 of 57 nonpolio AFP children using shotgun Sanger sequencing and in 9 of 57 patients using RT-nested PCR (5). It is possible that the cosavirus loads in stool samples are generally lower than HEV loads, thereby making detection using limited shotgun sequencing less likely. Indeed, in-depth 454 sequencing of 8,276 clones from the sample from patient 5550 and 25,516 clones from the sample from patient 6572 revealed the presence of cosaviruses missed by Sanger sequencing. These results indicate that while a wide range of distinct viruses (belonging to different and in some cases new viral species) can be detected using low-level Sanger subclone sequencing, the very high sensitivity of nested PCR still allows more cases of presumably low-level infections with known viruses to be detected.
We detected at least five novel viruses or groups of viruses: a new human bocavirus (18), members of a new Picornaviridae genus (20), a new circovirus (unpublished results), a new nodavirus (unpublished results), and new dicistroviruses (unpublished results). Sequences from divergent viruses that may represent new genotypes of enteroviruses, parechoviruses (23), cardioviruses (5), and picobirnaviruses (unpublished data) were also found. The novel nodavirus sequences were clearly distinguishable from the nodavirus sequences recently generated from diarrhea samples, overall exhibiting less than 41% amino acid identity to the previously generated sequences (13). The most diverse viral sequences detected and reported here belonged to the dicistrovirus-like category, in which polymerase and other enzymatic regions exhibited less than 35% amino acid identity to dicistrovirus sequences currently in GenBank. Dicistrovirus-like sequences were detected in samples from three patients, two of which, patients 6178 and 6344, were coinfected with members of the new Picornaviridae genus, Cosavirus. The dicistrovirus-like sequences exhibited 70 to 75% nucleotide identity to one another, a level of divergence otherwise seen among different species of dicistroviruses.
It remains to be determined which of these novel viruses are capable of replication in the human gut, as it is conceivable that some were consumed and their nucleic acids traveled through the digestive tract intact, as attested to by the detection of nucleic acids from plant viruses which have previously been shown to remain infectious (41).
Nodaviruses are small, single-stranded, bipartite RNA viruses that to date have been shown to naturally infect only insects and fish. Nodaviruses have been detected previously in human stool (13) and are semipermissive of replication in mammalian tissues (4, 12). Dicistroviruses have been shown to replicate and be pathogenic in insects (10, 37). The internal ribosomal entry site between the two cistronic segments can act as a powerful promoter in mammalian cells (28). However, reports of viral replication within mammalian cell lines are contradictory; one group has demonstrated the replication of a dicistrovirus, Taura syndrome virus, in human cell lines (3), while another has failed to reproduce Taura syndrome virus growth in mammalian cell lines (27). The ability of pathogenic porcine circovirus 2 to replicate in pigs is well established (1, 24, 36). Whether the circovirus detected in the sample from patient 5006 represents the first human circovirus or a circovirus from ingested meat remains unknown. In vitro replication as well as serological and larger epidemiological studies will be necessary to determine the range of host species tropisms and pathogenic potentials of these new viruses.
Three of the 35 AFP cases were fatal. The sample from child 5550, in which six distinguishable eukaryotic viruses (adenovirus, cosavirus, HEV-B, HEV-C, rhinovirus, and cucumber mosaic virus) were observed, exhibited the highest level of coinfection; patient 2296 was coinfected with HEV-B and a cosavirus; and patient 6178 exhibited coinfection with dicistrovirus and cosavirus. Stool from patient 6178 was likely to contain a high titer of dicistrovirus based on the large fraction of sequence from this virus derived by random amplification (Fig. 2B and D) and dilution end point PCR, which indicated a viral load of approximately 10^6 genome copies per ml of stool supernatant (data not shown). While cosaviruses were present in all three fatal cases, the difference in cosavirus prevalence among all AFP patients combined and healthy controls was not statistically significant (20). Since even clearly pathogenic picornaviruses, such as poliovirus, typically produce no clinical manifestations in 99 to 99.9% of infections (26), failure to detect a significant association with disease in this small cohort does not absolve cosaviruses, cardioviruses, or other new viruses of possible pathogenic roles.
In summary, we have used limited Sanger sequencing of stool samples from children with AFP to detect both known and novel viruses. By increasing the depth of the nucleic acid sampling using 454 pyrosequencing, we detected more viruses likely present at lower viral loads. These studies provide a framework for further studies that can be applied to numerous cases of AFP reported by the Global Polio Laboratory Network; of the 700,000 cases reported since 1997, only 6.5% have been attributed to poliovirus and 15 to 30% have been attributed to nonpolio enteroviruses (9). PCR studies of stool and tissue samples from subjects of different ages and geographic origins, both with and without diseases, as well as serological testing, will be required to determine the epidemiology and pathogenicity of these new viruses. The numerous known and new viruses in stool samples from developing countries, a likely result of limited access to adequate sanitary conditions resulting in frequent enteric infections, also indicate that such samples provide readily accessible material for further viral discovery.
This research was supported by NHLBI grant R01HL083254 to E.L.D.
Published ahead of print on 11 February 2009.