Journal of Virology, May 2005, p. 6487-6504, Vol. 79, No. 10
0022-538X/05/$08.00+0 doi:10.1128/JVI.79.10.6487-6504.2005
Copyright © 2005, American Society for Microbiology. All Rights Reserved.
Plum Island Animal Disease Center, Agricultural Research Service, United States Department of Agriculture, Greenport, New York 11944,1 Area of Virology, School of Veterinary Sciences, University of Buenos Aires, 1427 Buenos Aires, Argentina,2 Department of Pathobiology and Veterinary Science,3 Center of Excellence for Vaccine Research, University of Connecticut, Storrs, Connecticut 062694
Received 4 August 2004/ Accepted 18 January 2005
| ABSTRACT |
|---|
| INTRODUCTION |
|---|
FMDV is a small nonenveloped virus with a pseudo T=3 icosahedral capsid made up of 60 copies each of four structural proteins. The capsid surrounds an 8.4-kilobase, positive-sense, single-stranded RNA genome which is covalently bound at its 5' end to the small viral protein 3B (or VPg) and is polyadenylated at its 3' end. Upon virus entry into a cell, the viral genome is rapidly translated into a polyprotein which is co- and posttranslationally cleaved by viral proteinases into several partially cleaved, likely functional, intermediates and ultimately into 12 mature proteins (87, 97).
The FMDV genome organization is similar to that of other picornaviruses, including a large single open reading frame (ORF) flanked by highly structured 5' and 3' untranslated regions (5' UTR and 3' UTR, respectively) (Fig. 1A). The 5' UTR consists of, from the 5' end, a 350- to 380-nucleotide (nt) "short" (S) fragment, a 100- to 420-nt poly(C) tract (90% C), and the approximately 700-nt 5' terminus of the genomic "long" (L) fragment, which contains three or four tandemly repeated pseudoknots, a stem-loop cis-acting replication element (cre), and a type II internal ribosome entry site (IRES) (69). The FMDV 5' UTR plays important roles in cap-independent translation initiation of the viral polyprotein and in viral genome replication (64). The 3' UTR is about 90 nt long and is thought to contain cis-acting elements required for efficient genome replication (4).
Despite a basic understanding of many aspects of picornaviral biology, much about the functions of FMDV UTRs, proteins, and protein precursors and their roles in virulence, host range, and virus transmission remains poorly understood. Comparative genomic analysis of a large number of FMDV genomes may allow the identification of highly conserved regulatory or coding regions which are critical for aspects of virus biology. To date, the few comparative analyses of full-length FMDV genomes have relied on intratypic (serotype O) or intertypic (serotypes O, A, and C) comparisons of a small number of isolates and serotypes (70, 86). In addition, complete genomic sequences of some serotypes are not available (SAT1 and SAT3) or, for others, only represent highly cell culture-adapted isolates (serotypes A and C). Studies involving large numbers of virus isolates have largely focused on genomic regions encoding structural proteins associated with serotype specificity for phylogenetic purposes (reviewed in reference 55).
For this work, we obtained and analyzed 103 complete genome sequences representing all FMDV serotypes, including the previously unavailable SAT1 and SAT3 genomes. Our analyses identified novel highly conserved genomic regions indicating functional constraints for variability, novel viral genomic motifs with likely biological relevance, and previously undescribed virus lineages.
| MATERIALS AND METHODS |
|---|
Complete viral genomes were reverse transcribed to cDNAs by use of a Super SMART PCR cDNA synthesis kit (Clontech), using the supplied modified oligo(dT) according to the manufacturer's instructions. After spin column purification, the full-length double-stranded cDNA from each isolate was amplified in eight identical long-distance PCRs (2 µl cDNA in a 50-µl reaction volume; 20 cycles of 95°C for 15 s, 65°C for 15 s, and 68°C for 2 min) using primers supplied in the kit. Next, 21 overlapping 1.2-kb PCR amplicons spanning the whole FMDV genome were generated in 96-well plates (2.5 µl of template from each of the eight double-stranded cDNA replicates in 25-µl reactions using the primers described in Table S1 in the supplemental material). Adjacent amplicons overlapped for >50% of their lengths. To prevent truncation by subsequent rounds of PCR, we artificially created poly(C) sequences by substituting flanking sequences with specific primers (F1R and F2F) containing six C's or G's.
Amplified products of the correct size were purified (DNA filtration system; Eppendorf) and resuspended in 40 µl of sterile double-distilled H2O per well. All amplicon replicates (eight per FMDV subgenomic fragment per isolate) were subjected to direct sequencing (2.5 µl of template per reaction) using position-specific forward and reverse primers (see Table S2 in the supplemental material), Big Dye Terminator cycle sequencing kits (Applied Biosystems), and a PRISM 3730xl automated DNA sequencer (Applied Biosystems). To overcome virus variability and PCR/sequencing errors, we sequenced each amplicon in multiple reactions with both forward and reverse primers, specifically selected by serotype (see Table S2 in the supplemental material). Since each amplicon overlapped its neighbor by >50%, the multiple direct sequencing reactions resulted in a redundancy rate ranging from 13 to 48 and averaging 28 sequence events per nucleotide.
Direct DNA sequencing of amplicons derived from a given FMDV isolate yielded a master sequence representing the most probable nt for each position of the sequence. This approach prevented analyses of minor sequence variants, polymerase misincorporation errors, and sequencing ambiguities through multiple independent cDNA synthesis, PCR amplification, and direct sequencing events. Due to the quasispecies nature of FMDV populations, polymorphisms were detected at some nt positions. Nevertheless, all positions could be unambiguously assigned to a single nt due to the high degree of redundancy generated by the sequencing strategy. The following sequence ambiguity code was used: K (T/G), M (A/C), R (A/G), S (C/G), W (A/T), Y (C/T), B (C/T/G), D (A/T/G), H (A/C/T), V (A/C/G), and N (A/C/G/T).
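As an illustration of how a single symbol per position can be derived from redundant base calls, the following minimal Python sketch calls one alignment column. It is not the procedure used in this study: the function name `call_position` and the `min_fraction` cutoff are hypothetical, and only the ambiguity code table follows the list given above.

```python
# Ambiguity codes as listed in the text (DNA alphabet).
AMBIGUITY = {
    frozenset("GT"): "K", frozenset("AC"): "M", frozenset("AG"): "R",
    frozenset("CG"): "S", frozenset("AT"): "W", frozenset("CT"): "Y",
    frozenset("CGT"): "B", frozenset("AGT"): "D", frozenset("ACT"): "H",
    frozenset("ACG"): "V", frozenset("ACGT"): "N",
}


def call_position(bases, min_fraction=0.25):
    """Return one symbol for a column of base calls at a single position.

    min_fraction is a hypothetical cutoff: a base must reach this share of
    the calls at the position to be kept in the final symbol.
    """
    counts = {}
    for base in bases.upper():
        if base in "ACGT":
            counts[base] = counts.get(base, 0) + 1
    total = sum(counts.values())
    if total == 0:
        return "N"  # no usable calls at this position
    kept = frozenset(b for b, n in counts.items() if n / total >= min_fraction)
    if not kept:
        return "N"  # cutoff too strict for this column
    if len(kept) == 1:
        return next(iter(kept))
    return AMBIGUITY[kept]
```

With 28 calls at a position, of which 27 are A and 1 is G, the sketch returns `A`; with an even A/G split it returns `R`.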
Sequence analysis. Bases were called from chromatogram traces produced with Phred (25), which also produced a quality file containing a predicted probability of error at each base position. Viral sequences were assembled with the Phrap (http://www.phrap.org) and CAP3 assemblers (43). Gap closure was performed as described previously (3). Multiple sequence alignments were performed with the ClustalW (1.7) and Dialign (2.2) computer programs (73, 107). The positions of 5' UTR sequences presented in Results refer to the consensus sequence generated by ClustalW alignment of this region for the 103 isolates, with some manual editing. Nucleotide substitution analysis was carried out by use of the DISTREE (1.2) (103), DNArates (1.1) (105), ALISTAT (16), and PRETTY (GCG 10 software package [16]) programs. Analyses of codons and synonymous/nonsynonymous (syn/nonsyn) substitution ratios were calculated by using SNAP (77), CodonW (http://www.molbiol.ox.ac.uk/cu/), and codeml (PAML3.14 package), which was also used for statistical evaluations of heterogeneous selection pressures at amino acid sites (116). For protein analysis, the PRETTY program was used. Searches for motifs and/or signal sequences were performed with the MOTIFS (GCG package), HMM (58), pfscan (http://www.isrec.isb-sib.ch/ftp-server/pftools/pft2.2/pft2.2.tar.Z), and Blocks (36) programs of the PROSITE, Pfam, and Blocks databases. Transmembrane protein segment predictions were performed by using Memsat (50), Tmpred (41), Toppred (114), Psort (76), and Saps (11). Secondary structure predictions were performed for proteins by using GOR secondary structure prediction (29) and Pratt (48, 49) and for RNAs by using mfold (46, 117), rnafold (40), and Squiggles (83). Pseudoknot analysis was conducted with pknots (96).
Phylogenetic analysis was performed on aligned genomic and subgenomic regions of FMDV by utilizing the neighbor-joining and split decomposition methods as implemented in the Phylo_win (28) and/or SplitsTree 4.0 (http://www-ab.informatik.uni-tuebingen.de/software/jsplits/welcome_en.html) (44) software package and by utilizing maximum likelihood as implemented in Puzzle (104) and dnaml in the Phylip package (http://evolution.genetics.washington.edu/phylip.html) (27). Individual protein-coding regions were used to screen for incongruent tree topologies suggestive of genomic recombination, and split decomposition analysis (7) of genomic and subgenomic regions was utilized to graphically screen for reticulated branching patterns which were also suggestive of recombination. Similarity plots and bootscanning analysis (100), which compare a given query sequence to several reference sequences via incremental sliding sequence windows to yield corrected similarity values and bootstrap resampling frequencies, respectively, were performed as implemented in the SimPlot, v. 2.5, package (63), utilizing default settings and a window size of 400 nt. As a further test for potential recombination, Sawyer's run test (102), as implemented in GENECONV, v. 1.81 (http://www.math.wustl.edu/~sawyer), was used on Dialign-aligned FMDV polyprotein-encoding regions, using default settings.
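The sliding-window idea behind a similarity plot can be sketched in a few lines of Python. This is a simplified illustration, not the SimPlot implementation: it reports uncorrected percent identity only (no distance correction and no bootstrap resampling), and the function name `window_similarity` and the `step` value are assumptions; only the 400-nt window size is taken from the text.

```python
def window_similarity(query, reference, window=400, step=20):
    """Percent identity between two aligned sequences in sliding windows.

    Returns a list of (window_start, percent_identity) tuples, with
    0-based window starts. Columns in which either sequence has a gap
    ('-') are skipped. The step size is a hypothetical default.
    """
    if len(query) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    results = []
    for start in range(0, len(query) - window + 1, step):
        matches = 0
        compared = 0
        pairs = zip(query[start:start + window], reference[start:start + window])
        for q, r in pairs:
            if q == "-" or r == "-":
                continue
            compared += 1
            if q == r:
                matches += 1
        pct = 100.0 * matches / compared if compared else float("nan")
        results.append((start, pct))
    return results
```

A drop in similarity to one reference that coincides with a rise in similarity to another, over the same windows, is the kind of pattern that would prompt a closer look for recombination.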
Nucleotide sequence accession numbers. The GenBank accession numbers for the genome sequences of FMDV isolates sequenced for this study are listed in Table 1.
| RESULTS AND DISCUSSION |
|---|
FMDV genomes ranged from 8,046 to 8,215 nt, consistent with previously reported genome lengths (86, 108). Although small insertions/deletions were observed in the coding region affecting Lpro, 1B, 1C, 1D, and 3A, most variability in genome size was due to insertions/deletions in the UTRs.
5' UTR. The approximately 1,300-base 5' UTR plays important roles in FMDV replication. However, the specific contribution of S fragment, poly(C) tract, and L fragment pseudoknots to FMDV biology is unknown. The cardiovirus poly(C) tract has been shown to affect virulence (23), while shortening of the FMDV poly(C) tract affected virus growth in vitro but not virulence in a suckling mouse model (95). The cre is essential for picornaviral replication and contains a conserved AAACA motif which in poliovirus functions as a template for 3Dpol-mediated uridylylation of 3B (75). Mutation of the FMDV cre stem region reduced virus RNA replication (68). The approximately 500-nt FMDV IRES, which is responsible for cap-independent polyprotein translation (60), is predicted to contain four structural domains (Fig. 1D, panel II, domains 2 to 5) that interact with cellular factors involved in host translation initiation (94). Domain 2 binds the polypyrimidine tract-binding protein at a UUUC motif. Domain 3 contains at least three loop structures that are conserved in all picornaviruses, including a GNRA motif essential for the maintenance of the IRES tertiary structure and potentially involved in long-range RNA-RNA interactions. The Y-shaped domain 4 and domain 5 interact with the host translational factors eIF4B and eIF4G (65, 99). A 22-nt spacer for ribosomal recognition separates the IRES from the first of two polyprotein start codons, and an additional 84 nt separates the first start codon from the second. Picornavirus utilization of the second start codon seems to be affected by its adjacent sequence and by sequences in the IRES (82). Unlike poliovirus, FMDV utilizes the second start codon twice as often as the first (66).
The analysis in this study of 5' UTRs from 103 FMDV isolates showed that only 12% and 33% of nt were invariant in the 5' UTR S and L fragments, respectively (Fig. 1B and C; Table 2). In pairwise comparisons, however, FMDV isolates averaged 80% and 85% nt identity for the S and L fragments, respectively, indicating a high degree of conservation between isolates. This suggests that although the FMDV 5' UTR can tolerate changes at most positions, selective pressures clearly favor overall conservation, likely for the maintenance of a functional secondary structure. Relatively high rates of substitution (>10) in the 5' UTR L fragment were observed at positions 513, 558, 615, 684, 696, and 1144 (Fig. 1B).
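The two measures contrasted above respond differently to sample size: a column stops being invariant as soon as any one isolate differs, whereas pairwise identity averages over pairs. A minimal Python sketch on a made-up toy alignment (the sequences and function names are illustrative assumptions, not data from this study) shows how a low invariant fraction can coexist with high pairwise identity.

```python
from itertools import combinations


def percent_invariant(alignment):
    """Share of alignment columns in which all sequences carry the same base."""
    columns = list(zip(*alignment))
    invariant = sum(1 for col in columns if len(set(col)) == 1)
    return 100.0 * invariant / len(columns)


def mean_pairwise_identity(alignment):
    """Average percent identity over all pairs of aligned sequences."""
    scores = []
    for a, b in combinations(alignment, 2):
        same = sum(1 for x, y in zip(a, b) if x == y)
        scores.append(100.0 * same / len(a))
    return sum(scores) / len(scores)


# Toy alignment: each of the last four sequences differs from the first
# at one position, and each at a different position.
toy = [
    "ACGTACGTAC",
    "TCGTACGTAC",
    "AGGTACGTAC",
    "ACTTACGTAC",
    "ACGAACGTAC",
]

print(percent_invariant(toy))       # 60.0
print(mean_pairwise_identity(toy))  # 84.0
```

In the toy case only 60% of columns are invariant although the sequences average 84% pairwise identity; adding more sequences with scattered single changes would lower the first number much faster than the second.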
In most cases, secondary structure analyses of S fragments predicted a single stem-loop structure similar to that previously proposed (12, 24). However, the S fragment sequences of 16% of the isolates folded into alternative structures within similar free energy levels (e.g., strains A22 Turkey/65, A24 Argentina/65, and Asia 1 Leb 83) (data not shown).
5' UTR L fragment. The 5' UTR region of the L fragment ranged from 604 to 751 nt long. Remarkably, apparently unrelated virus isolates contained identical deletions in the L fragment (Table 3). SAT and Euroasiatic-specific L fragment insertions/deletions were identified, as were isolate-specific insertions/deletions, including an 18-nt insertion located 28 nt downstream of the poly(C) tract in strain A5 Westerwald/51.
The pseudoknot region between positions 403 and 600 was highly tolerant to changes, with no invariant nt located within the first 200 positions and insertions/deletions tolerated at each position (Fig. 1C). Previously undescribed invariant nt and motifs located between the pseudoknot region and the IRES included AGAAWYGGGACGU (positions 617 to 629), GCRCACGWAACGCGC (positions 632 to 646), and ACAAAC (positions 668 to 673) within the cre.
FMDV IRES sequences (positions 640 to 1151) showed 70 to 100% nt identity in pairwise comparisons, 47% invariant nt, and numerous invariant motifs in the predicted secondary structure domains (Fig. 1C and D). Domain 2 contained the polypyrimidine tract-binding protein motif UUUC and three previously undescribed motifs at the base, bulge (invariant GGUCUWGAG motif), and apex regions of the stem-loop. Domain 3 contained GYRA (corresponding to the conserved picornaviral GNRA motif) and CRAAA (except in isolates O1 Argentina/65 and O Akesu/58) motifs in loops A and B, respectively. While the picornavirus domain 3 loop D motif ACCC was present in 99/103 FMDV isolates, the loop 3C motif ACAC was not conserved in FMDV. Domain 3 also contained novel highly conserved motifs, including the UCGUMGCGGAGCA (positions 823 to 835) motif and GRUACUGGUA and GRGACUGGUA motifs (positions 965 to 974), specific for Euroasiatic and SAT serotypes, respectively, and CUGGWGRCAGGCUAAGGAUGCCCU (positions 983 to 1006), predicted to form a prominent bulge at the base of the stem (Fig. 1C and D). Domain 4 was highly conserved, with two novel invariant motifs (GAUCUGAG [positions 1039 to 1046] and UUAAAAG [positions 1080 to 1087]) predicted to form prominent bulges in the secondary structure. In domain 5, 19 of the 21 nt were invariant (Fig. 1C and D). The conservation of these motifs suggests biological significance. Notably, the 22-nt region between the IRES and the first AUG was highly variable, possibly underlying FMDV's preferential use of the second start codon (9, 66). In summary, the FMDV 5' UTR, although tolerant of nt substitutions and insertions/deletions, showed significant conservation, especially in the IRES, where novel sequence motifs were identified.
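Motifs written with ambiguity codes, such as GGUCUWGAG or GYRA, can be located in a sequence by expanding each code into a character class. The Python sketch below is one possible way to do this with the standard `re` module and is not the motif search software used in this study; the function names and the example sequences are hypothetical.

```python
import re

# RNA expansion of the ambiguity codes used in the text.
IUPAC_RNA = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "K": "[UG]", "M": "[AC]", "R": "[AG]", "S": "[CG]",
    "W": "[AU]", "Y": "[CU]", "B": "[CUG]", "D": "[AUG]",
    "H": "[ACU]", "V": "[ACG]", "N": "[ACGU]",
}


def motif_to_regex(motif):
    """Translate an ambiguity-coded RNA motif into a regular expression string."""
    return "".join(IUPAC_RNA[ch] for ch in motif.upper())


def find_motif(motif, sequence):
    """Return the 1-based start positions of all (overlapping) motif matches.

    DNA input is accepted: T is converted to U before searching.
    """
    seq = sequence.upper().replace("T", "U")
    # A lookahead lets overlapping occurrences be reported as well.
    pattern = re.compile("(?=" + motif_to_regex(motif) + ")")
    return [m.start() + 1 for m in pattern.finditer(seq)]
```

For example, `find_motif("GYRA", "GCGAGUAA")` reports matches starting at positions 1 and 5 of that made-up sequence. Positions refer to the input sequence, so comparing them with the alignment positions quoted above would require the same consensus numbering.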
Polyprotein region. (i) Nucleotide sequence and codon analysis. The nucleotide variability for the 103 FMDV polyprotein (ORF) and individual protein regions is summarized in Table 2. Overall, 46% of all nt were invariant, with 73% nt identity between the least similar pair of sequences. An average transition/transversion (Ts/Tv) rate of 2.4 and syn/nonsyn substitution ratio of 2.1 were observed. As expected, the region encoding 1D was the most variable, exhibiting the lowest percentage of invariant nt (21%), Ts/Tv rate (1.55), and syn/nonsyn ratio (1.03) of all regions. In fact, excluding the highly conserved 1A (VP4) protein, regions encoding structural proteins had significantly lower Ts/Tv rates, nonsyn substitutions/site, and syn/nonsyn ratios than the rest of the genome. Although the substitution rates given here are averages for each protein-coding region, specific regions or residues within each protein were observed to contain higher or lower substitution rates (Fig. 2B and C).
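For readers unfamiliar with the Ts/Tv measure, the following Python sketch counts observed transitions (purine to purine or pyrimidine to pyrimidine) and transversions (purine to pyrimidine or the reverse) between two aligned sequences. It is a raw pairwise count for illustration only and is not equivalent to the model-based estimates produced by the programs named in Materials and Methods; the function name is an assumption, and both sequences are assumed to use the same alphabet (DNA or RNA).

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T", "U"}
BASES = PURINES | PYRIMIDINES


def ts_tv_counts(seq_a, seq_b):
    """Count transitions and transversions between two aligned sequences.

    Columns with gaps or ambiguity codes are ignored. Returns (ts, tv).
    """
    ts = 0
    tv = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == b or a not in BASES or b not in BASES:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            ts += 1  # change within the same chemical class
        else:
            tv += 1  # change between purine and pyrimidine
    return ts, tv
```

Because there are twice as many possible transversions as transitions, a ratio well above 0.5 indicates that transitions are favored.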
rates between different substitution models further indicated that four of these proteins (Lpro, 1D, 3A, and 3B) may indeed undergo diversifying selection (Table 4) (116). Notably, this was observed for relatively few codons (1.5% to 7.6%), with fewer highly significant ones (P > 0.9) tending to cluster within each protein (positions 19, 20, 22, 23, and 82 in Lpro; positions 45, 48, 142, 143, 144, and 146 in 1D; positions 44, 132, 135, 136, and 144 in 3A; positions 4 and 11 in 3B1; and positions 17, 18, and 19 in 3B2). Interestingly, the variable capsid proteins 1B and 1C did not appear in this analysis to undergo diversifying selection (Table 4).
Lpro lacked insertions/deletions, with the exception of two additional aa in Euroasiatic lineages (positions 22 and 23). Each methionine start codon was invariant, indicating that both Lpro isoforms are significant for aspects of FMDV biology. Consistent with the previously described hypervariability in the region flanked by the two AUGs, only one residue (C6) was invariant in this region (113). Only 44% of the residues in Lb were invariant, making it much less conserved than was previously reported (30). Although substitutions were concentrated in terminal regions of Lpro, the predicted secondary structure in these regions remained relatively unaltered (data not shown).
A detailed analysis of the Lpro/1A junction resulted in a more ambiguous cleavage motif than was previously described (Table 5) (113). Only GAGXS at the 1A N terminus and the previously described basic residue (K/R) required for cleavage at the Lpro C terminus were invariant. Residues required for Lpro catalytic activity (C52, H149, and D165) (32, 59), suggested to be involved in Lpro autocatalysis (E77) (89), and important for eIF4G cleavage (H110 and H139) (90) were invariant, except for an H110D substitution in all SAT viruses, an E77Q substitution in O6 Pirbright/65, and an E77K substitution in A Phillipines/75 (Table 6). The present FMDV analysis revealed that only 43 of the 65 residues previously identified as conserved between FMDV O1 Kaufbeuren/66 Lpro and the aphthovirus equine rhinitis A virus Lpro were invariant, with 35 of the 43 residues concentrated in three distinct regions (positions 44 to 63, 95 to 110, and 133 to 185) (38) (Table 6). Additional, previously undescribed invariant FMDV Lpro motifs are shown in Table 6.
Structural proteins. FMDV structural proteins are involved in capsid assembly and stability, virus binding to target cells, and antigenic specificity, influencing significant aspects of virus infection and immunity (45). The high level of variability in FMDV external capsid proteins observed here likely reflects the selective pressures on them.
(i) 1A (VP4). 1A was the most conserved FMDV protein, with 81% of the aa being invariant (Fig. 3; Table 4), including the N-terminal myristylation site and a previously identified heterotypic swine and bovine T-cell epitope (1A positions 20 to 35) (10). Interestingly, the Q73 residue in the SAT viruses distinguishes them from Euroasiatic lineages. Residues potentially specific for SAT2 and SAT3 (I76) and for SAT1 (V80) were also identified.
(ii) 1B (VP2). 1B, a protein of 218 or 219 aa, plays a critical role in virion structural stability and maturation (14). N-terminal regions of 1B contain previously undescribed motifs. For example, DKKTEETTLLEDRILTTRNGHT(T/I)STTQSSVG and DKKTEETT(L/H)LEDRI(L/M/V)TT(S/R)H(G/N)TTTSTTQSSVG are conserved among Euroasiatic and SAT virus isolates, respectively. Three previously identified T-cell epitopes (positions 48 to 68, 114 to 132, and 179 to 187) were conserved (88). Notably, only the N-terminal half of each epitope was invariant, while their C-terminal regions were highly variable.
(iii) 1C (VP3). 1C, a protein of 219 to 221 aa, contains important conformational neutralizing epitopes and makes significant contributions to capsid stability (62). Among the FMDV isolates examined here, 1C was highly variable, with only 39% of the aa being invariant. Most amino acid substitutions were concentrated in four regions (positions 55 to 88, 130 to 140, 176 to 186, and 196 to 208), with insertions/deletions present in the first two regions and a previously described T-cell epitope present in the second region (88). The 1B/1C cleavage site was nearly invariant in Euroasiatic lineages but much less conserved among SAT isolates (Table 5).
(iv) 1D (VP1). 1D is the most studied FMDV protein due to its significance for virus attachment and entry, protective immunity, and serotype specificity (45). 1D ranges in size from 217 to 221 aa, with insertions/deletions contained in regions 140-150 and 166-170. Overall, only 26% of the 1D residues are invariant. The invariant residues and motifs are shown in Fig. 3. Given the available data and the proposed significance of the conserved RGD motif for virus reception and pathogenesis (71, 72), it is notable that complete integrin-binding RGD motifs were lacking in 9 of the 103 FMDV isolates examined here (GGD in C Waldman strain 149 and C4 Tierra del Fuego/66, TGD in C5 Argentina/69, RGE in Asia1/2 Isrl 3/63 and A21 Kenya/64, RDD in A25 Argentina/59 and A Canefa 1/61, HGD in Asia1/3 Kimrom, and PGD in A27 Colombia/67) and in additional sequences present in GenBank (RGN in Asia1 Pak/1/54 and O/Syria/1/87, RSG in A A/IND/110/99, RGE in O KEN/5/95, KGD in O PAK/1/94 and O/IRQ/26/2000, SGD in O Akesu/58, and IGD in O3/Venezuela/51), with substitutions tolerable at any one of the three residues.
Exposure of the 1D N terminus, a region which includes a 10-aa motif which is conserved in several other picornaviruses, is critical for poliovirus entry into the cell (13). The FMDV 1D sequences studied here lack this N-terminal motif, suggesting that some aspects of viral entry may differ from those of other picornaviruses.
Structural implications of capsid protein amino acid conservation. Picornaviral 1A is cleaved from the 1AB (VP0) precursor by unknown mechanisms, and this cleavage is required for virion maturation and infectivity (6, 74). During poliovirus entry into host cells, 1A is involved in receptor-mediated capsid conformational changes resulting in membrane ion channel formation and the release of viral RNA into the cytoplasm (reviewed in reference 42). Several FMDV residues which are known or suspected to affect 1AB cleavage based on their homology to other picornaviral 1ABs were identified in the FMDV isolates examined here. Notably, a number of these residues were located in previously unidentified invariant sequence motifs, suggesting that the critical function of these residues may be contextual. Specifically, these included 1B residues H145, P144, and L83 and 1C residues G39, F41 (replaced by Y in three isolates), and A50, which is included in the conserved motif LDVAEACPT (positions 45 to 53) (8). Additional residues contributing to the 1AB cleavage pocket include 1D P204, contained in the motif RMKRAE(T/L)YCPR (positions 195 to 205), and 1B V32 (I in SAT2 viruses), T33, and Y36, included in the motif TTSTTQSSVG(V/I)T(Y/F)GY (positions 22 to 36). C-terminal to this last motif lies a putative serotype sequence signature (positions 36 to 47) confirmed for 173 available 1B sequences (YSTXEDHXXGPN in A viruses, YXTXEDFVXGPN in O viruses, YATXEDXXGPN in C viruses, YXVXEDAVSGPN in Asia1 viruses, YAXXDXFLPGPN in SAT1 viruses, YADXDSFRXGPN in SAT2 viruses, and YXSADRFLPGPN in SAT3 viruses) and the motif GPNT(S/N)GLEXRVxQAER(F/Y)(F/Y)K (positions 45 to 63), which is present in all FMDV isolates examined here.
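The putative serotype signatures listed above can be checked against a 1B sequence by treating X as a wildcard. The Python sketch below does this with regular expressions; the signature strings are copied as given in the text, whereas the function name and the test peptides are made up for illustration and are not sequences from this study.

```python
import re

# Putative 1B signatures (positions 36 to 47) as listed in the text;
# X stands for any amino acid.
SIGNATURES = {
    "A": "YSTXEDHXXGPN",
    "O": "YXTXEDFVXGPN",
    "C": "YATXEDXXGPN",
    "Asia1": "YXVXEDAVSGPN",
    "SAT1": "YAXXDXFLPGPN",
    "SAT2": "YADXDSFRXGPN",
    "SAT3": "YXSADRFLPGPN",
}


def matching_serotypes(peptide):
    """Return the serotypes whose signature occurs anywhere in the peptide."""
    peptide = peptide.upper()
    hits = []
    for serotype, signature in SIGNATURES.items():
        pattern = signature.replace("X", ".")  # X matches any residue
        if re.search(pattern, peptide):
            hits.append(serotype)
    return hits
```

With the hypothetical peptide `YSTEEDHVAGPN`, only the serotype A signature matches. A peptide matching none of the patterns returns an empty list, which in practice could indicate either a novel variant or a sequencing or alignment problem.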
An H-rich region at the 1B/1C interface is proposed to mediate 1B-1C hydrogen bonding and is likely involved in virion sensitivity to acid, a characteristic with implications for virus stability and transmission (2). 1B residues H21 (position 19 in SAT viruses), H145, H157, and H174 were invariant in all FMDV isolates studied here, while H87P and H168Y substitutions were present in serotype C and Euroasiatic viruses, respectively. In addition, 1C contained five invariant H residues, at positions 86 (84 in SAT viruses), 109, 146, 149, and 198 (196 in SAT viruses).
The N termini of five 1C molecules make up the β annulus at the axis of virion fivefold symmetry, thus contributing to capsid stability (1). The 53 N-terminal residues of 1C, including P4, which is possibly involved in binding to the 3Cpro substrate-binding site, and C7, a residue that is invariant in Euroasiatic isolates and involved in disulfide bonding between 1C N termini, were invariant within serotypes (1). However, all SAT viruses demonstrated an unexpected nonconservative C7V substitution, suggesting that SAT virus capsids may exhibit distinct physical properties.
Nonstructural proteins. (i) 2A. FMDV 2A is an 18-aa peptide similar to the C terminus of cardiovirus 2A, a protein which induces a modification of the cellular translation apparatus resulting in 2A release (19, 20). The FMDV 2A proteins examined here averaged an 89% amino acid identity. Fourteen residues were identified as 98% invariant, including the DVEXNPG motif, which is essential for encephalomyocarditis virus 2A activity (21). The high level of 2A conservation likely reflects structural and functional constraints associated with the small size of the protein.
(ii) 2B. FMDV 2B localizes to sites of virus genome replication in ER-derived vesicles (106). The 2B protein in other picornaviruses enhances membrane permeability, blocks protein secretory pathways, suppresses apoptotic responses by affecting intracellular Ca2+ homeostasis, and is implicated in virus-induced cytopathic effects (17, 47, 112).
FMDV 2B, a protein of 154 aa, contains 117 invariant residues (Table 4), with amino acid substitutions limited to only one or two alternate residues per site. A previously undescribed, conserved transmembrane domain was predicted between positions 120 and 140, suggesting that 2B is an integral membrane protein and consistent with the proposed localization of 2B to ER-derived vesicles (106).
(iii) 2C. FMDV 2C is homologous to poliovirus 2C, an ATPase affecting the initiation of minus-strand RNA synthesis and whose precursor, 2BC, induces vesicle formation in the cytoplasm (53). FMDV 2C localizes to membrane-associated virus-replicating complexes (106).
FMDV 2C is 318 aa long and contains 72% invariant residues, including those of the putative ATP/GTP binding domain (positions 110 to 116, 160 to 163, and 243 to 246) (15). Amino acid substitutions unique to individual SAT serotypes were identified between positions 33 and 92.
(iv) 3A. FMDV 3A has been implicated in virus virulence and host range, similar to the 3A proteins of other picornaviruses (33, 61). Deletions in 3A have been associated with FMDV attenuation in cattle and with the porcinophilic phenotype of O Taiwan/97 (54). However, evidence suggests that other viral genetic determinants are necessary for these phenotypes (54, 81).
FMDV 3A ranges from 143 to 153 aa, with insertions/deletions preferentially occurring at positions 70 to 110 and 130 to 150, regions identified here as containing previously undescribed variability. In fact, 3A is one of the most variable proteins encoded by FMDV, with fewer invariant aa (37%) than the variable capsid proteins 1B and 1C and the highly variable Lpro and contrasting with the high degree of conservation previously reported for a limited number of PanAsia 3A sequences (54). As described above, 3A contains residues predicted to undergo positive selection, including the Q44 residue, at which a Q44R mutation was previously associated with a pathogenic phenotype in guinea pigs (78). Although the Q44R mutation was present in several guinea pig-adapted isolates examined here, this mutation was absent from other isolates adapted to guinea pigs and present in still others that were not adapted to guinea pigs, suggesting that alternative mutations may underlie this particular phenotype. A previously described transmembrane domain (positions 60 through 76) (54) which tolerated conservative amino acid substitutions at all positions except invariant residues L64, L68, A70, and I72 was also variable in this study. Our data revealed limited variability in the potential 3A T-cell epitope that was previously identified between positions 21 and 35 (10). The variable nature of 3A suggests that it may be highly informative for epidemiologic or forensic purposes. Additionally, the likely role for 3A in virulence and host range suggests that interactions with host factors underlie 3A's variability and the diversifying selection predicted to act upon it.
(v) 3B. FMDV is the only picornavirus to encode multiple 3B proteins (3B1, 23 aa; 3B2, 24 aa; and 3B3, 24 aa), whose homologue in poliovirus primes genomic RNA synthesis during virus replication (26, 85). In our study, 3B1, 3B2, and 3B3 were present in all FMDV isolates. The motif GPYXGP (except for the substitution GPYXRP in Sat 1/20 Rv 11/37), which contains a Y residue homologous to the poliovirus Y3 residue involved in phosphodiester linkage of 3B to the 5' end of the viral genome, was invariant in the N terminus of each protein (5) (Table 7). The 3B C-terminal regions contained more amino acid variability, including the majority of observed nonconservative substitutions and the fewest invariant residues. Notably, 3B3 was highly conserved in all isolates examined, supporting previous experimental evidence indicating that only this 3B isoform is essential for virus viability (26, 84). 3B1 and 3B2 were more variable, and in fact, contained residues predicted here to undergo diversifying selection (3B1 residues 4 and 11 and 3B2 residues 17, 18, and 19). These data, in conjunction with experimental data indicating that the deletion of 3B1 and 3B2 may affect FMDV virulence and host range (84), suggest that, similar to that of 3A, 3B1 and 3B2 protein variability reflects host range-specific functions.
(vii) 3D. The 469-amino-acid viral RNA-dependent RNA polymerase 3Dpol, responsible for generating minus- and plus-sense genomic RNA, is one of the most conserved FMDV proteins. Our analysis indicated that although it is conserved, 3Dpol is more tolerant of substitutions than was previously reported (56), as our results extended the proportion of variable residues from 8.6% to 26%. D245, N307, and G295, which are essential for maintaining the functional integrity of the picornaviral polymerases (56), were invariant in FMDV 3Dpol, along with other residues described as being highly conserved among all RNA-dependent RNA polymerases (31), including the NTP-binding residues G337, D338, and D339 (115). The previously described picornaviral polymerase peptide motifs KDELR, PSG, YGGD, and FLKR were conserved among all FMDV isolates analyzed (18, 51). Finally, the three previously described hypervariable, hydrophobic antigenic regions of 3Dpol (aa 1 to 12, 64 to 76, and 143 to 153) were also variable in all virus isolates examined here (31).
3' UTR. The picornaviral 3' UTR binds several viral and host proteins and is believed to contain structural cis-acting elements required for negative-strand RNA synthesis (4). Removal of the terminal poly(A) tract or mutagenesis of structural elements abrogates the infectivity of FMDV infectious clones (98). FMDV 3' UTR sequences stimulate virus-specific, IRES-dependent translation (67) and likely affect other aspects of viral replication, including genome circularization (37).
The FMDV 3' UTRs were highly variable in length among the isolates, ranging from 85 to 101 nt. A secondary structure analysis of the FMDV 3' UTR confirmed the Y shape which is also predicted for other picornaviral 3' UTRs, suggesting that the structure plays an important role in the 3' UTR function (Fig. 1D and data not shown). A previously unidentified motif was located at the vertex of the Y structure between positions 37 and 61 (Fig. 1C and D). In some cases, however, structural features which could conceivably affect the efficiency of ribosome/RNA dissociation and translation were observed (for example, the exceptionally long stem of A10 Holland/42 and the short stem of O2 Brescia/47) (data not shown).
FMDV phylogenetic and recombination analysis. To date, phylogenetic analyses have been performed largely on FMDV sequences from the 1D coding region. These analyses have permitted discrimination among serotypically related FMDV isolates (55). Split decomposition, a clustering technique used previously on 1D sequences to suggest the occurrence of quasispecies evolution in Euroasiatic FMDV (22), was used to examine the complete genome sequences of all 103 isolates described here (Fig. 4A). The results indicated that complex phylogenetic relationships exist between members of different serotypes, including relationships between the A24 Argentina/65 and European/South American O1 viruses; between A12 Valle strain 119 and C Waldman strain 149; between the O1 M11, O2 Brescia/47, and O3 Venezuela/71 viruses and European serotype A viruses; between O5 India/62 and Asia1 viruses from Israel; among six SAT1 and three SAT3 viruses; and most notably, between SAT1/7 Isrl 4/62 and SAT2/3 Kenya 11/60 and other SAT1 and SAT2 viruses (Fig. 4A and data not shown).
Taken together, these data suggest that FMDV capsid protein sequences may undergo intertypic recombination infrequently relative to that in other genomic regions, which conceivably undergo complex recombination events and fail to display serotype-specific phylogenetic relationships. These observations are consistent with and extend previous reports of FMDV recombination, which has been inferred from incongruent Lpro, 3Cpro, and 1D topologies and observed (and suggested to predominate) in C-terminal genomic regions (52, 57, 101, 113). Similar observations have recently been made for enterovirus genomes, which were suggested to undergo relatively extensive recombination in nonstructural gene regions while generally maintaining serospecific monophyly (79). These observations, including the inability to reliably define viral relationships based on nonstructural protein-encoding sequences and data suggestive of recombination, raise interesting questions about FMDV genome evolution in nature and the relative contribution of recombination to the generation of FMDV genetic and population diversity.
Notably, the analysis described here identified a novel SAT lineage represented by SAT1-7 Isrl 4/62 and SAT2-3 Kenya 11/60, which contain nonstructural protein-encoding regions that are divergent from each other but are also clearly distinct from those of the other SAT and Euroasiatic lineages presented here (Fig. 4C). The distinct nature of nonstructural proteins from these structurally/serotypically SAT-like viruses was supported by a previously available genome sequence similar to that of SAT2-3 Kenya (SAT2 Ken/3/57; 97% overall nucleotide identity) (Table 1) and by previous data indicating that the SAT2 Ken/3/57 Lpro- and 3C-encoding regions may be distinct from those of other SATs (102). Taken together, these results suggest that more FMDV genome diversity may exist in nature than is currently indicated by serology or 1D sequence analysis.
| ACKNOWLEDGMENTS |
|---|
| FOOTNOTES |
|---|
Supplemental material for this article may be found at http://jvi.asm.org/.
| REFERENCES |
|---|