ABSTRACT
The etiology of a large proportion of gastrointestinal illness is unknown. In this study, random Sanger sequencing and pyrosequencing approaches were used to analyze fecal specimens from a gastroenteritis outbreak of unknown etiology in a child care center. Multiple sequences with limited identity to known astroviruses were identified. Assembly of the sequences and subsequent reverse transcription-PCR (RT-PCR) and rapid amplification of cDNA ends generated a complete genome of 6,586 nucleotides. Phylogenetic analysis demonstrated that this virus, named astrovirus VA1 (AstV-VA1), is highly divergent from all previously described astroviruses. Based on RT-PCR, specimens from multiple patients in this outbreak were unequivocally positive for AstV-VA1.
Astroviruses consist of a family of small, single-stranded, positive-sense RNA viruses. Their genomes range from 6.1 to 7.3 kb in length (6, 14) and contain three open reading frames (ORFs) denoted ORF1a, -1b, and -2, which encode a serine protease, an RNA-dependent RNA polymerase (RdRP), and a capsid precursor protein, respectively (14). Astroviruses are known to infect a variety of species (3, 12, 20). In humans, eight serotypes have been described, which have been associated with up to ∼10% of sporadic cases of diarrhea in children (2, 7, 10, 11, 17) and 0.5 to 15% of outbreaks (1, 13, 18). In addition, a highly divergent member of this family, astrovirus MLB1, was recently identified in patients with diarrhea (5, 6).
Significantly, the etiologies of 12 to 41% of all gastroenteritis outbreaks remain undetermined even after extensive testing, suggesting that there is a diagnostic gap (13, 18). In this paper, we applied mass sequencing to analyze specimens obtained from an unexplained outbreak of gastroenteritis at a child care center. We report the identification and complete genome sequencing of a novel astrovirus, referred to as astrovirus VA1 (AstV-VA1) (S. R. Finkbeiner, Y. Li, S. Ruone, C. Conrardy, N. Gregoricus, D. Toney, H. W. Virgin, L. J. Anderson, J. Vinjé, D. Wang, and S. Tong, U.S. patent application).
Details of outbreak.
On 18 August 2008, the Eastern Shore Health District in Virginia was notified of cases of gastrointestinal illness at a child care center, with the outbreak lasting for a period of 2 to 3 weeks. Any attendee or staff member of the day care who had diarrhea and/or vomiting after 1 July 2008 fit the case definition for this outbreak. Control measures were put in place immediately at the center, including exclusion of symptomatic children, mandated testing of all symptomatic staff, testing of symptomatic children, and ultimately, temporary closing of the facility. By the conclusion of the outbreak, 26 patients fit the case definition. From these patients, fecal specimens from six patients (labeled A to F) (Table 1) were available for extensive testing. All six samples tested negative for known enteric parasites and enteric bacteria by standard microscopy analysis and culture. Similarly, all six samples tested negative for rotavirus (RotaClone enzyme immunoassay), norovirus, sapovirus, human astrovirus, and group F adenoviruses by reverse transcription-PCR (RT-PCR) (4, 16, 21), with the exception of samples B and F, which were intermittently positive at the limit of detection for human astrovirus.
Epidemiologic data of six specimens from a child care center outbreak of acute gastroenteritis
Genome amplification and sequencing.
Five of the fecal specimens (A to E) were analyzed independently in two laboratories by mass sequencing. At Washington University, total nucleic acid was extracted from diluted fecal specimens A, B, C, and D and randomly amplified as previously described (22), and the products were subjected to high-throughput pyrosequencing using the GS-FLX Titanium platform (Roche) (average of 12,730 reads per sample). We identified 313 unique high-quality sequence reads in sample B and 1,017 unique high-quality reads in sample C which were divergent from but most closely related to astroviruses, based on BLAST alignments. No astrovirus sequences were detected in sample A or D. A 6,376-nucleotide (nt) contig was assembled from the sequences detected in sample B, and four contigs totaling 6,026 nt were assembled from sample C. Because the overlapping sequences obtained in samples B and C were identical, the five original contigs were assembled to generate a 6,581-nt contig [excluding the poly(A) tail].
At CDC, total nucleic acid was extracted from samples A, B, C and E and randomly amplified as described previously (22). Amplicons 300 to 800 bp in length were then cloned using the TOPO TA cloning kit (Invitrogen, Carlsbad, CA), and plasmids were sequenced using the Sanger method on an ABI Prism 3130 automated sequencer (Applied Biosystems, Foster City, CA). Three out of 96 clones from sample B and 69 out of 152 clones from sample C contained sequence signatures most closely related to previously known astroviruses by BLASTn similarity searches. Sequencing of 100 clones each from samples A and E yielded no clones with detectable similarity to astroviruses. The 69 clones from sample C were assembled into four contigs. Primers were then designed to generate a series of eight overlapping RT-PCR amplicons with an average size of ∼900 bp that yielded a contig of 6,537 nt. In order to define the 5′ end of the genome, three independent rapid amplifications of 5′ cDNA ends were performed, and a total of 23 clones from these reactions were sequenced. All clones extended the genome by 49 nt and yielded the identical 5′ end sequence, suggesting that the genome was complete with a total length of 6,586 nt, excluding the poly(A) tail. The 3′ end was confirmed by rapid amplification of 3′ cDNA ends.
Comparison of the genome sequences generated by the two sequencing methods yielded nearly identical sequences, with the exception of five missing nucleotides at the 5′ end of the contig generated by pyrosequencing and three nucleotide substitution differences. These were resolved by direct PCR sequencing to generate the final, corrected sequence, which was deposited in GenBank (accession number FJ973620).
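The lengths reported for the two assemblies can be cross-checked with simple arithmetic. The following is a minimal sketch that uses only the numbers given in the text; the constant and function names are illustrative, and it assumes, as stated above, that the pyrosequencing contig differs in length only at the 5′ end.

```python
# Lengths reported in the text, in nucleotides, excluding the poly(A) tail.
RT_PCR_CONTIG_NT = 6537     # contig from the eight overlapping RT-PCR amplicons
RACE_5P_EXTENSION_NT = 49   # extension contributed by the 5' RACE clones
PYRO_CONTIG_NT = 6581       # merged pyrosequencing contig (samples B and C)
PYRO_MISSING_5P_NT = 5      # nucleotides missing at the 5' end of that contig


def complete_genome_length(contig_nt: int, extension_nt: int) -> int:
    """Add the 5' extension to the RT-PCR contig length."""
    return contig_nt + extension_nt


genome_nt = complete_genome_length(RT_PCR_CONTIG_NT, RACE_5P_EXTENSION_NT)
print(genome_nt)                                          # 6586
print(genome_nt - PYRO_CONTIG_NT == PYRO_MISSING_5P_NT)   # True
```

Both routes agree on a genome of 6,586 nt: the Sanger contig plus the 49-nt extension, and the pyrosequencing contig plus its five missing 5′ nucleotides.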
Genome analysis.
The genome of AstV-VA1 has three predicted ORFs as well as nontranslated regions (NTRs) at both the 5′ and 3′ ends of the genome. ORF1a and ORF2 were predicted by the NCBI ORF Finder (Table 2). The full coding region for ORF1b, which is produced by a −1 ribosomal frameshift during translation (8), was defined using the conserved heptameric “slippery sequence” (AAAAAAC) near the end of ORF1a as the start site (8). The sequence AUUUGGAGNGGNGGACCNAAN5-8AUGNC (AUG is the start codon for ORF2) located upstream of ORF2, which has been proposed as the promoter for subgenomic RNA synthesis in all previously known astroviruses (14), is also present in AstV-VA1 with only two nt differences. The 3′ NTR of nearly all astroviruses contains a highly conserved RNA secondary structure called the stem loop II-like motif (s2m) (9, 15). An alignment of the 150 nt just upstream of the poly(A) tail of AstV-VA1 with the 3′ NTR sequences of other astroviruses demonstrated that AstV-VA1 contained the highly conserved ∼33-nt core of the s2m motif. The exact role of this motif is not understood; however, its presence in multiple viral families suggests it may play an important role in the astrovirus life cycle.
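As an illustration of how a degenerate consensus such as the proposed subgenomic promoter can be located while tolerating a small number of differences, the sketch below scans an RNA string for the motif, treating N as any nucleotide and trying each spacer length from 5 to 8. This is not the procedure used in this study. The function names and the test sequence are hypothetical, and the sequence shown is synthetic, not the AstV-VA1 genome.

```python
from typing import Optional, Tuple

PREFIX = "AUUUGGAGNGGNGGACCNAA"  # consensus upstream of the variable spacer
SUFFIX = "AUGNC"                 # ORF2 start codon (AUG) plus two positions
SPACER_LENGTHS = range(5, 9)     # the N5-8 spacer


def count_mismatches(pattern: str, window: str) -> int:
    """Count positions that differ, ignoring wildcard (N) positions."""
    return sum(1 for p, w in zip(pattern, window) if p != "N" and p != w)


def best_promoter_hit(rna: str, max_mismatches: int = 2) -> Optional[Tuple[int, int, int]]:
    """Return (start, spacer_length, mismatches) for the best hit, or None."""
    best = None
    for spacer in SPACER_LENGTHS:
        pattern = PREFIX + "N" * spacer + SUFFIX
        for start in range(len(rna) - len(pattern) + 1):
            mm = count_mismatches(pattern, rna[start:start + len(pattern)])
            if mm <= max_mismatches and (best is None or mm < best[2]):
                best = (start, spacer, mm)
    return best


# Synthetic example: an exact match with a 6-nt spacer, starting at index 2.
synthetic = "GG" + "AUUUGGAGAGGAGGACCAAA" + "CCCCCC" + "AUGGC" + "UU"
print(best_promoter_hit(synthetic))  # (2, 6, 0)
```

Setting `max_mismatches` to 2 mirrors the two-nucleotide difference reported for AstV-VA1; a stricter or looser threshold would be a design choice for the screen at hand.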
Genome comparison of AstV-VA1 to other fully sequenced astroviruses
Phylogenetic analysis.
ClustalX (version 1.83) was used to align each complete ORF of AstV-VA1 with the respective available complete ORFs of other astroviruses in GenBank. Maximum parsimony trees were then generated using PAUP with 1,000 bootstrap replicates (19). This analysis demonstrated that AstV-VA1 was highly divergent from but most closely related to mink and ovine astroviruses in ORF1a and ORF1b (Fig. 1A and B). In the capsid region, for which more astrovirus sequences are available, AstV-VA1 was most similar to mink and California sea lion astroviruses (Fig. 1C).
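For readers unfamiliar with bootstrapping, the resampling step behind each replicate can be sketched as follows: alignment columns are drawn with replacement until the replicate has the same length as the original alignment, and a tree is then inferred from each replicate. The sketch covers only the resampling and is not PAUP's implementation. The function name and the toy alignment are hypothetical.

```python
import random
from typing import Dict


def bootstrap_replicate(alignment: Dict[str, str], rng: random.Random) -> Dict[str, str]:
    """Resample alignment columns with replacement, keeping rows in register."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_cols = lengths.pop()
    # One shared list of column indices so every taxon gets the same columns.
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[i] for i in cols) for name, seq in alignment.items()}


# Toy protein alignment (invented residues, for illustration only).
toy = {
    "taxon_1": "MKTLVA",
    "taxon_2": "MKSLIA",
    "taxon_3": "MRSLIG",
}
rng = random.Random(0)  # fixed seed so the replicates are reproducible
replicates = [bootstrap_replicate(toy, rng) for _ in range(1000)]
print(len(replicates), len(replicates[0]["taxon_1"]))  # 1000 6
```

The bootstrap value for a clade is then the percentage of replicate trees in which that clade is recovered.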
Phylogenetic analysis of AstV-VA1 ORFs. Phylogenetic trees were generated in PAUP, using the maximum parsimony method with 1,000 bootstrap replicates. Significant bootstrap values are shown. (A) ORF1a serine protease. (B) ORF1b polymerase. (C) ORF2 capsid. HAstV, human astrovirus; Bat AFCD337, MpAstV/HK/AFCD337/06; Bat LD71, TmAstV/GX/LD71/07; Bat LC03, HpAstV/GX/LC03/07; Bat LD38, TmAstV/GX/LD38/07.
In terms of sequence identity, as expected, ORF1b was the most highly conserved region, sharing 61% amino acid identity to mink astrovirus and 62% to ovine astrovirus. The ORF1a (serine protease) coding region was more divergent, with 39% and 40% amino acid identities with ovine astrovirus and mink astrovirus, respectively. In ORF2, AstV-VA1 shared 41% amino acid identity to mink astrovirus and 41% to California sea lion astrovirus 1.
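Percent amino acid identity depends on how gapped columns are counted, and the convention used for the values above is not stated here. As a minimal sketch under one common assumption, the function below scores a pairwise alignment over the columns in which both sequences have a residue. The function name and example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are excluded from the denominator
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Invented aligned fragments: 3 identical residues out of 4 compared columns.
print(percent_identity("MK-LV", "MKALI"))  # 75.0
```

Using the full alignment length as the denominator instead would give lower values for gapped alignments, so the convention should be reported alongside any identity figure.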
RT-PCR screening for AstV-VA1.
Real-time RT-PCR and semi-nested RT-PCR assays were developed, targeting regions in ORF1b and ORF2 of AstV-VA1, respectively. All six samples were tested with both assays (Table 1). Four independent nucleic acid extractions of each sample were prepared. Each extraction of samples B, C, and F was unequivocally positive in both assays, with threshold cycle (Ct) values in the real-time RT-PCR assay ranging from 18 to 20, suggesting that a high copy number of AstV-VA1 was present in those samples. The other three samples were intermittently weakly positive in the semi-nested RT-PCR assay (A, 1/4 extractions; D, 3/4 extractions; E, 1/3 extractions) and in the real-time RT-PCR assay (A, 1/4 extractions; D and E, 3/4 extractions). For the real-time RT-PCR assay, in the instances where these three samples were positive, the Ct values were near the limit of detection, ranging from 34 to 42. These results suggest that samples A, D, and E may contain very low copy numbers of AstV-VA1 RNA, which may explain the variation in results for the four independent nucleic acid extractions. Negative controls included on each run were all negative. The 250-bp amplicon generated by the semi-nested PCR assay was confirmed as AstV-VA1 in all samples by sequencing.
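The distinction drawn above between consistently strong positives and intermittent results near the limit of detection can be expressed as a simple rule over replicate extractions. The sketch below is illustrative only: the two cutoffs are taken from the Ct ranges reported in the text (18 to 20 and 34 to 42), the function name and category labels are hypothetical, and the per-extraction Ct values in the example are invented, since individual values are not given here.

```python
from typing import List, Optional

HIGH_COPY_MAX_CT = 20.0  # upper end of the range reported for samples B, C, and F
NEAR_LOD_MIN_CT = 34.0   # lower end of the range reported for samples A, D, and E


def summarize_extractions(cts: List[Optional[float]]) -> str:
    """Classify replicate real-time RT-PCR results; None means no amplification."""
    positives = [ct for ct in cts if ct is not None]
    if not positives:
        return "negative"
    if len(positives) == len(cts) and max(positives) <= HIGH_COPY_MAX_CT:
        return "consistently positive, high copy number"
    if min(positives) >= NEAR_LOD_MIN_CT:
        return "weakly positive, near limit of detection"
    return "indeterminate"


# Invented Ct values for four extractions of two hypothetical samples.
print(summarize_extractions([18.4, 19.1, 18.9, 19.7]))
# consistently positive, high copy number
print(summarize_extractions([None, 36.2, None, None]))
# weakly positive, near limit of detection
```

Results that fall between the two ranges are deliberately left as indeterminate rather than forced into either category.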
Despite the availability of improved molecular diagnostic methods for an increasing panel of gastroenteritis agents in humans, the etiology of 12 to 41% of the outbreaks of gastroenteritis remains unexplained (13, 18). In this study, we identified a novel astrovirus (AstV-VA1) in fecal samples from an outbreak of acute gastroenteritis. Complete genome sequencing and phylogenetic analysis demonstrated that AstV-VA1 was highly divergent from all previously described astroviruses, including the eight human astrovirus serotypes and the recently described astrovirus MLB1 (AstV-MLB1) (6). The discovery of AstV-VA1 following the recent identification of AstV-MLB1 clearly demonstrates that a much greater diversity of astroviruses exists in humans than was previously recognized.
The detection of AstV-VA1 at high copy numbers in three out of six samples (and potentially at very low levels in the other three samples) from this outbreak suggests a potential association between AstV-VA1 and diarrheal illness. However, because of the limited number of samples available for analysis in this cluster, further studies defining the frequency of detection of AstV-VA1 in samples from individuals with and without acute gastroenteritis are needed to define the role of AstV-VA1 in human diarrheal disease.
Nucleotide sequence accession number.
The nucleotide sequence determined in this study was deposited in the GenBank database under accession number FJ973620.
ACKNOWLEDGMENTS
This work was supported in part by National Institutes of Health grant U54 AI057160 to the Midwest Regional Center of Excellence for Biodefense and Emerging Infectious Diseases Research. D.W. holds an Investigators in the Pathogenesis of Infectious Disease Award from the Burroughs Wellcome Fund.
The findings and conclusions in this article are those of the authors and do not necessarily represent the views of the Centers for Disease Control and Prevention. This article received clearance through the appropriate channels at the CDC prior to submission.
FOOTNOTES
- Received 17 May 2009.
- Accepted 31 July 2009.
- Copyright © 2009 American Society for Microbiology