Journal of Virology, September 2006, p. 8869-8879, Vol. 80, No. 18
0022-538X/06/$08.00+0 doi:10.1128/JVI.00510-06
Copyright © 2006, American Society for Microbiology. All Rights Reserved.
Division of Transplant Pathology, Department of Pathology, and Division of Transplant Surgery, Department of Surgery, University of Pittsburgh Medical Center, and Department of Pediatrics, Children's Hospital of Pittsburgh, Pittsburgh, Pennsylvania
Received 11 March 2006/ Accepted 23 June 2006
Characterization of genetic diversity of BK virus has biologic as well as clinical implications. This information is needed to seek potential relationships between viral genotype and clinical disease. Sequence data are required by diagnostic virology laboratories to ensure that primers and probes being used for amplification of viral DNA can successfully detect all naturally occurring viral strains. One of the most variable regions of the viral genome is the noncoding control region (NCCR), which shows insertions, deletions, duplications, and complex rearrangements involving enhancer and/or promoter sequences (28, 29). Significant variation is also recognized in the VP1 gene, which codes for VP1, a major structural protein that comprises approximately 80% of the total viral capsid protein. The VP1 protein bears important domains which interact with viral receptors on host cells. A single amino acid change in VP1 protein has been shown to result in increased pathogenicity of mouse polyomavirus (15). Based on VP1 sequence data and restriction enzyme analysis, Jin et al. (20) designed a typing scheme whereby all the existing BKV isolates are classified into four genotypes, which have been designated types I, II, III, and IV. These genotypes correlated well with the serotypes defined by Knowles et al. (24). Knowledge of genetic variation in other regions of the viral genome is limited, even though a number of other functionally important proteins are coded by the virus. Thus, agnoprotein seems to be involved in multiple functions, including enhancing nuclear localization of VP1, viral capsid assembly, virion release, and DNA repair (22). VP2 and VP3 contribute to the scaffolding of the viral capsid architecture. The T antigen promotes viral replication, binds to tumor suppressor proteins Rb and p53, and stimulates host cell entry into the cell cycle (19).
Only limited BKV whole-genome information has been published in the literature. Until recently, only three BKV whole-genome sequences were available, namely those from the MM, Dun, and AS strains. In 2004, Chen et al. published 15 whole-genome sequences derived by cloning from three individuals (8). We have now sequenced the complete genomes of BKV isolates from 20 renal transplant recipients. These whole-genome sequences and additional partial sequences published in the literature have been analyzed using phylogenetic methods. The principal aims of our study were to (i) determine the most informative viral genomic regions from a phylogenetic standpoint, (ii) define clades for epidemiological studies incorporating additional sequence data that became available after the publication of Jin et al.'s genotyping schema, and (iii) seek evolutionary relationships between BKV strains isolated from different geographical locations and clinical settings, including healthy individuals, pregnant women, renal transplant patients with asymptomatic viruria or BKV nephropathy, bone marrow transplant recipients with viruria or hemorrhagic cystitis, and patients with systemic lupus erythematosus (SLE).
Long-range PCR. DNA was extracted from 5 ml of urine using the QIAGEN Maxi kit (Valencia, CA) and eluted in a final volume of 200 µl of buffer AE, supplied by the manufacturer. The entire circular genome of BKV was then amplified in a single long-range PCR using a strategy that has been successful for JC virus (1). The protocol used was developed by the late Gerald Stoner at the National Institutes of Health. DNA was first digested by 1 U of BamHI (catalog no. R6021; Promega, Madison, WI) in a 10-µl reaction containing 3 µl of DNA and the supplied reaction buffer. The reaction mix was incubated at 37°C for 30 min, followed by 65°C for 20 min and then a 4°C soak. The BamHI digest was then used to amplify the full-length linearized BKV genome using 6 pmol each of primers which overlapped the BamHI site. The primer sequences were the following: BBam 1, GGG ATC CAG ATG AAA ACC TTA GGG G; BBam 2, TGG ATC CCC CAT TTC TGG GTT TAG G. Amplification of BamHI-digested DNA was performed in a 50-µl reaction using rTth polymerase-based long-range PCR (catalog no. N808-0188; Applied Biosystems, Foster City, CA). PCR conditions consisted of a first denaturation step of 94°C for 4 min, 14 cycles of denaturation at 94°C for 40 s, annealing at 64°C for 6 min, 5 cycles of denaturation at 94°C for 40 s, annealing at 64°C for 8 min, 5 cycles of denaturation at 94°C for 40 s, annealing at 64°C for 10 min, 5 cycles of denaturation at 94°C for 40 s, annealing at 64°C for 12 min, 5 cycles of denaturation at 94°C for 40 s, annealing at 64°C for 14 min, and a final elongation period at 72°C for 10 min, followed by a soak at 4°C. Cleanup of the long-range PCR product was performed using Millipore PCR Cleanup columns (catalog no. UFC7PCR50; Billerica, MA).
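As a quick sanity check on the primer design described above, the following Python sketch confirms that both long-range PCR primers contain the BamHI recognition sequence GGATCC, which is what lets them overlap the BamHI site. The primer sequences are copied from the text; the helper names `normalize` and `site_positions` are illustrative assumptions and not part of the original protocol.

```python
# BamHI recognition sequence (palindromic: GGATCC on both strands).
BAMHI_SITE = "GGATCC"

# Primer sequences as printed in the text (spaces separate triplets only).
PRIMERS = {
    "BBam 1": "GGG ATC CAG ATG AAA ACC TTA GGG G",
    "BBam 2": "TGG ATC CCC CAT TTC TGG GTT TAG G",
}


def normalize(seq):
    """Remove whitespace and upper-case a primer sequence."""
    return "".join(seq.split()).upper()


def site_positions(seq, site=BAMHI_SITE):
    """Return 0-based start positions of `site` within `seq`."""
    seq = normalize(seq)
    return [i for i in range(len(seq) - len(site) + 1)
            if seq[i:i + len(site)] == site]


for name, primer in PRIMERS.items():
    print(name, len(normalize(primer)), "nt; BamHI site at", site_positions(primer))
```

Both primers are 25 nucleotides long and carry the site one base in from the 5' end, so each amplified product is expected to begin and end at the BamHI cut used to linearize the genome.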
DNA sequencing. The long-range PCR product (30 to 90 ng of DNA) was then subjected to a cycle sequencing reaction using Big Dye chemistry (Big Dye Terminator Cycle Sequencing kit, v. 3.1; Applied Biosystems Inc., Foster City, CA) and 3.2 pmol of the appropriate forward or reverse sequencing primer. A set of 30 sequencing primers spanning the entire viral genome was designed for this project (Table 1). PCR conditions consisted of a first denaturation step of 96°C for 2 min, followed by 25 cycles of denaturation at 96°C for 30 s, annealing at 50°C for 15 s, and extension at 60°C for 4 min. Cleaning up of the cycle sequencing PCR products was performed using CentriSep 96-well plates (catalog no. CS-961; Princeton Separations Inc., Adelphia, NJ) using the manufacturer's instructions. DNA sequencing was performed at the University of Pittsburgh Genomics and Proteomics Core Facility using an ABI 310 automated sequencer. Sequences were trimmed, analyzed, and assembled using Sequencher 4.2 (Gene Codes Corporation, Ann Arbor, MI). All base calls were verified manually and compared to the Dun reference sequence (GenBank accession no. V01108). The presence of nucleotide polymorphisms was accepted only if the chromatogram reading was unambiguous and the observed changes occurred in more than one overlapping sequence at that nucleotide position. All forward primers were run in duplicate to ensure multiple coverage of polymorphic sites. This paradigm corrects for the possibility of sequencing errors being introduced during PCR as a result of infidelity in DNA polymerase.
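The acceptance rule for polymorphisms described above can be sketched in Python as follows. This is a minimal illustration, not the procedure actually used (base calls were verified manually in Sequencher). The function name `accept_polymorphism`, the `(base, unambiguous)` input format, and the requirement that all unambiguous reads agree are assumptions made for the example.

```python
def accept_polymorphism(ref_base, read_calls):
    """Return the variant base if it passes the acceptance rule, else None.

    ref_base   -- base in the reference sequence at this position
    read_calls -- list of (base, unambiguous) tuples, one per overlapping
                  read; `unambiguous` is True when the chromatogram peak
                  is clean (hypothetical input format)
    """
    ref = ref_base.upper()
    # Keep only calls whose chromatogram reading was unambiguous.
    clean = [base.upper() for base, unambiguous in read_calls if unambiguous]
    variants = {base for base in clean if base != ref}
    if len(variants) != 1:
        return None  # no change, or conflicting changes
    (variant,) = variants
    support = clean.count(variant)
    # Require the change in more than one overlapping read. Assumption:
    # every unambiguous read must agree with it.
    if support >= 2 and support == len(clean):
        return variant
    return None


print(accept_polymorphism("A", [("G", True), ("G", True)]))   # accepted
print(accept_polymorphism("A", [("G", True), ("G", False)]))  # rejected
```

Requiring agreement across all clean reads is deliberately conservative; a laboratory pipeline might instead flag conflicting positions for manual review.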
TABLE 1. Sequencing primers for the BKV genome
TABLE 2. BKV strains phylogenetically analyzed in this study
Phylogenetic trees were constructed using the maximum parsimony (MP), neighbor-joining (NJ), and unweighted pair group method with arithmetic mean (UPGMA) methods (Fig. 1 to 3). MP analysis was based on a total of 4,465 sites, of which 233 were parsimony informative. The original MP tree had a consistency index of 0.85, a retention index of 0.93, and a rescaled consistency index of 0.79. An MP tree constructed from the 45 aligned complete sequences is shown in Fig. 1. It is a bootstrap consensus tree derived from 31 equally parsimonious trees calculated by MEGA 3.1. There are six major clusters supported by bootstrap values of >50%. For simplicity, these six clusters are designated A, B, C, D, E, and F. To eliminate the possibility that the tree structure was skewed by inclusion of multiple isolates from the same patient, we also constructed trees using only a single consensus sequence from each patient for whom multiple sequences were available. The six major clusters were retained (data not shown).
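The three MP tree statistics quoted above are related: the rescaled consistency index is, by definition, the product of the consistency index and the retention index. A short check using the reported values follows; the symbols CI, RI, and RC are standard abbreviations rather than notation used in the original text.

```latex
\[
\mathrm{RC} = \mathrm{CI} \times \mathrm{RI} = 0.85 \times 0.93 \approx 0.79,
\qquad
\frac{233}{4{,}465} \approx 5.2\% \ \text{of sites parsimony informative}.
\]
```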
FIG. 1. An unrooted bootstrap consensus tree was constructed by the maximum parsimony method using 45 whole-genome sequences. The presence of six major phylogenetic clusters is noted. The numbers at the nodes are the bootstrap confidence levels (in percentages) obtained for 1,000 replicates (values ≥50% are indicated). Strain notations and origins are described in Table 2.
FIG. 3. Unrooted consensus NJ tree constructed from the large T-gene region derived from 45 whole-genome sequences. A phylogenetic clustering similar to that obtained with the whole-genome data was observed from the large T data analysis. Strain notations and origins are described in Table 2.
All the remaining isolates were found to form two clusters, E and F. These two clusters, supported by high bootstrap values, separated out from the major subtypes Ia, Ic, III, and IV and are, therefore, proposed to represent newly recognized genotypes V and VI, respectively. Cluster E, representing the proposed genotype V, consists of several Pitt strains associated with viruria (PittVR1, PittVR2, PittVR3, PittVR5, PittVR6, PittVR7, and PittVR10), two associated with nephropathy (PittNP2 and PittNP3), four associated with viremia (PittVM1, PittVM3, PittVM4, and PittVM5), and three clones derived from a healthy subject (HC-u2, HC-u5, and HC-u9). Interestingly, healthy control sequences (HC-u5, HC-u2, and HC-u9) from a subject in the Boston area formed a subcluster separate from the Pitt isolates. Cluster F, representing proposed genotype VI, consists primarily of cloned sequences derived from a single patient with BKV vasculopathy and capillary leak syndrome (CAP-m2, CAP-m5, CAP-m9, CAP-m13, CAP-m18, CAP-mh2, CAP-m5, CAP-m8, and CAP-m22), a single subject with HIV infection (HI-u5, HI-u6, and HI-u8), and Pitt isolates with viruria (PittVR8), viremia (PittVM2), and nephropathy (PittNP1). The HIV sequences formed a subcluster separate from the remaining sequences in this cluster, supported with a bootstrap value of 99%. Additional data are needed to determine if this separation reflects a specific disease association or the common origin of all HIV clones from the same patient. All the clones from the CAP patient formed a cluster separate from the renal transplant sequences from Pittsburgh.
The primary branching of BKV strains into six major clusters seen in the MP tree was quite well reproduced by the NJ tree, with bootstrap values of >80%. Indeed, even second-order subclusters were identical in both trees, with some bootstrap values as high as 100%. The existence of six major clusters was further confirmed by phylogenetic analysis using the UPGMA method (data not shown).
VP1 sequences. To determine whether the major phylogenetic clusters A, B, C, D, E, and F observed with whole-genome sequences could be reproduced by using gene-specific data, we constructed trees using subsets of the whole-sequence information. Initially, complete 1,089-bp VP1 alignments were constructed from the 45 whole-genome sequences. This alignment showed that 86.13% of the positions were invariant. The bootstrap consensus tree constructed by the NJ method was in agreement with the whole-genome tree. The six major clusters A, B, C, D, E, and F contained the same viral strains shown in Fig. 1 and 2. MP analysis produced 164 equally parsimonious trees, differing from each other and from the NJ tree only in second-order branching patterns within the clusters. The aforementioned analysis did not allow us to ascertain the relationship of our proposed genotypes V and VI (clusters E and F) to several previously published and only partially sequenced strains which had been classified by the Jin schema. Hence, we constructed another tree (Fig. 2) of 50 complete VP1 gene sequences, including the 45 strains shown in Fig. 1; type II strain SB (17); type Ib strains WW (7, 18, 40), Dik (18, 40), and JL (30, 40); and type Ic strain MT (39, 40). In this VP1 region tree, Jin genotype II (SB) and genotype III (AS) appeared to branch off from a common ancestor, supporting the separate recognition of these two genotypes. However, the type Ib JL strain grouped with genotype V strains in cluster E, whereas the type Ib strains WW and Dik clustered with genotype VI strains in cluster F. The MT strain grouped with subtype Ic strains in cluster B.
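The invariant-site percentage quoted for the VP1 alignment is the fraction of alignment columns in which every sequence carries the same base. The Python sketch below shows that calculation on a toy alignment. The three short sequences are invented for illustration and are not BKV data, and `invariant_fraction` is a hypothetical helper name.

```python
def invariant_fraction(aligned_seqs):
    """Fraction of alignment columns in which all sequences are identical."""
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")
    invariant = sum(1 for column in zip(*aligned_seqs)
                    if len(set(column)) == 1)
    return invariant / length


# Toy alignment (made-up sequences): columns 3 and 6 vary.
toy = ["ACGTAC",
       "ACGTAT",
       "ACCTAC"]
print(f"{invariant_fraction(toy):.2%} invariant")

# Taking the reported 86.13% at face value for a 1,089-bp alignment:
print(round(0.8613 * 1089), "invariant positions (approximate)")
```

On that reading, roughly 938 of the 1,089 VP1 positions are invariant, leaving about 151 variable positions.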
FIG. 2. Unrooted consensus NJ tree using 1,000 bootstrap replicates constructed from the VP1 gene region from 45 whole-genome sequences and the reference strains. The six-cluster phylogenetic pattern was reproduced by the VP1 gene. No distinct subgroup Ib cluster was identified. JL clustered with genotype V strains, and WW and Dik clustered with genotype VI strains. Strain notations and origins are described in Table 2.
The six phylogenetic clusters were resolved by the analysis of the Japanese isolates and sewage isolates (248 bp long). Only partial separation of the clusters was obtained by the analysis of the SLE isolates (214 bp long) and the sewage isolates (160 bp long), indicating that these shorter fragments carry insufficient sequence information. No separate clustering of the sewage or SLE isolates was observed. In the analysis of the Japanese isolates, BMT isolates KOM-1, KOM-5, KOM-9, and KOM-19 and RT isolate THK-2, classified earlier as subtype Ib by Takasaka et al. (43), were observed to cluster with type VI isolates (42). As previously reported by Takasaka et al. (43), KOM-3 and RYU-7 grouped as type III, and THK-3, THK-8, TW-3, TW-3a, TW-9, KOM-2, KOM-12, KOM-22, RYU-3, and RYU-5 grouped as type IV. The remaining Japanese BKV isolates clustered as type Ic. Japanese BKV strains and strains originating in Pittsburgh clustered separately by NJ, UPGMA, and MP methods.
Large T, small t, agnogene, VP2, and VP3 sequences. Analysis restricted to large T antigen successfully resolved clusters A through F with high bootstrap values (85% by NJ method) (Fig. 3). These major clusters could also be recognized in trees made from VP2 and VP3 sequences (data not shown). However, when agnogene or small t sequences alone were examined, complete resolution of the six major clusters was not observed. This is likely due to the smaller size of these sequence data sets as well as greater evolutionary conservation of the corresponding genes.
Gene polymorphisms in phylogenetic clusters. An attempt was made to define the polymorphic sites that help distinguish the major clusters in our phylogenetic trees. Table 3 lists the gene-specific distribution of these sites. Table 4 provides a detailed listing of all polymorphisms which can assist viral genotyping. The only polymorphic sites listed are those shared by all of the isolates in a particular cluster. It is apparent that the variations are not randomly distributed but seem to be arranged in "hot spots." As expected, the most polymorphic region is VP1, where 13.86% of all sites showed variation, although only 34.44% of these helped distinguish between clusters (Table 3). Significant variation was also found in the large T-antigen gene, wherein polymorphisms were found in 11.39% of all nucleotide sites, 46.2% of which were cluster specific. This degree of genetic variability in the BKV T-antigen gene has not been previously documented. Agnogene was the most conserved coding area in the viral genome.
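The idea of a cluster-distinguishing site can be made concrete with a small Python sketch. It assumes one possible reading of the criterion above: a site counts for a cluster when all members of that cluster share a single base that no sequence outside the cluster carries. The sequence names, toy sequences, and the function `cluster_specific_sites` are hypothetical and do not reproduce Table 4.

```python
def cluster_specific_sites(alignment, clusters):
    """Map each cluster to the 1-based positions that distinguish it.

    alignment -- dict: sequence name -> aligned sequence (equal lengths)
    clusters  -- dict: cluster label -> list of sequence names
    """
    length = len(next(iter(alignment.values())))
    result = {label: [] for label in clusters}
    for pos in range(length):
        for label, members in clusters.items():
            inside = {alignment[name][pos] for name in members}
            outside = {alignment[name][pos]
                       for name in alignment if name not in members}
            # Shared by every member, and absent from all other sequences.
            if len(inside) == 1 and not (inside & outside):
                result[label].append(pos + 1)
    return result


# Toy data (invented): two clusters of two sequences each.
toy_alignment = {"s1": "ACGT", "s2": "ACGA", "s3": "TCGC", "s4": "TCAC"}
toy_clusters = {"X": ["s1", "s2"], "Y": ["s3", "s4"]}
print(cluster_specific_sites(toy_alignment, toy_clusters))
```

In this toy case position 1 separates both clusters, position 4 is specific to cluster Y only, and positions 2 and 3 are uninformative, which mirrors the distinction drawn above between sites that merely vary and sites that help distinguish clusters.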
TABLE 3. Number of nucleotide differences in each gene
TABLE 4. Genotyping sites for the 45 whole-genome sequences
TABLE 5. Amino acid changes differentiating clusters in 45 whole-genome tree
TABLE 6. Amino acid pattern observed in VP1 protein
Phylogenetic analysis of all 45 full-length sequences established six clusters, designated A, B, C, D, E, and F. Cluster A contained isolates exclusively from genotype Ia. Cluster B (subtype Ic) was represented by isolates from renal transplant patients from Japan. Cluster C (subtype III) included strain AS, which is the only genotype III whole-genome sequence published to date. Cluster D (subtype IV) was represented by TW-3a, a Japanese renal transplant isolate. All the remaining isolates were divided among the clusters E and F. Phylogenetic analysis of VP1, VP2, VP3, and large-T-derived gene-specific sequences by three different methods corroborated our whole-genome analysis, demonstrating the existence of two previously unrecognized clusters, E and F, which we propose to call genotypes V and VI, respectively. These new genotypes seem to have independently branched off from the four other clusters, A, B, C, and D. Due to the lack of an appropriate outgroup for BKV phylogenetic analysis, it is not possible for us to determine the order in which different BKV genotypes have evolved.
Based on conventional alignments of VP1 sequences, existing BKV strains have been classified into genotypes I, II, III, and IV (7, 18, 37). Genotype I has been subdivided into subgroups Ia (Dun and MM strains), Ib (Dik, JL, and WW strains), and Ic (MT) (43). Phylogenetic analysis of the VP1 gene sequences in our study supports the existence of genotypes Ia and Ic. However, no separate cluster corresponding to genotype Ib was identified. Of BKV strains designated Ib previously, Dik and WW clustered with type VI strains, while JL clustered with type V strains. As noted by Chen et al., the recognition of genotype Ib is based essentially on differences at three nucleotide positions, 1698, 1809, and 1923 (8). Genotype Ib-defining substitutions at these three sites do not result in amino acid changes and are not expected to result in distinct viral serotypes. Hence, genotype Ib may not be a biologically relevant taxonomic subgroup. In contrast, Chen et al. (8) did observe genotype Ic to represent a distinct phylogenetic cluster, similar to our analysis in which the MT strain clusters with other Ic strains published by Takasaka et al. (42). The amino acid sequence of subtype Ic was unique at position 225 in VP1 protein and position 91 in VP2 protein. Genotypes V and VI, as defined by us, have distinctive nucleotide substitutions at positions 427, 1022, 1091, 1109, 1322, 1575, 1908, 2076, 2127, 2559, 2619, 2708, 3205, 3652, 3654, 3673, 3736, 3749, 3904, 4417, 4435, 4575, 4606, 4626, and 5103 (Table 4). These nucleotide substitutions result in four nonsynonymous amino acid changes: one each in agnoprotein, VP2, VP3, and large T protein.
Although not informative for genotyping, amino acids at positions 82, 340, and 362 in the VP1 protein showed an interesting pattern of changes, which suggested that amino acid substitutions at these three locations might result in type-determining changes in three-dimensional protein configurations. In genotype II, III, and IV isolates, amino acids at all three positions were substituted compared to the Dun reference Ia sequence (Table 6), whereas type VI strains showed no substitution at any of these three positions. In genotype Ic, amino acid substitutions occurred only at positions 82 and 340, whereas in type V any one of the three sites could be mutated. Amino acid position 82 can be predicted to map to the BC loop, whereas positions 340 and 362 were mapped to predicted "C-insert" and "C loop" of the C-terminal region, respectively (27). Since the BC loop is believed to interact with the cellular receptor for BKV, it can be speculated that the genotype-specific amino acid changes might alter BKV tissue tropism. By the same token, changes in the C-terminal region may have implications with regard to the efficiency of viral capsid assembly. The biologic basis for the amino acid substitution constraints observed at positions 82, 340, and 362 is not clear, since the BC loop and C terminus are not known to interact with each other.
BKV is primarily a kidney pathogen. However, a BKV (Yale) strain has been amplified from a leukemia patient who had lytic infection in both the kidney and brain (38). A partial VP1 sequence of this unique strain reportedly showed three mutations within the VP1 gene (positions 1687, 1702, and 1908), which distinguished this strain from other type I sequences. Many of the genotype V BKV strains from Pittsburgh were identical to the Yale strain at positions 1687 and 1908, but none showed the G1702C mutation. The latter mutation results in a Glu-to-Gln substitution, which leads to predicted changes in the β structure of the encoded protein and a postulated increase in the tropism of BKV for brain tissue. The G1702C mutation, however, has also been reported by Chen et al. (8) in one of three clones derived from an HIV-infected patient who is not known to have viral encephalitis.
In our follow-up renal transplant patients in the clinic, we have observed that while 30% of patients develop asymptomatic BK viruria, only 1 to 2% develop viral nephropathy. Whole-genome phylogenetic analysis does not suggest any major evolutionary distinction between viral strains obtained from patients with or without nephropathy, since sequences derived from both clinical categories did not form distinct clusters. This suggests that the intensity of immunosuppression and genetic susceptibility of the host immune system, possibly regulated by host gene polymorphisms, are the principal determinants of whether or not a particular patient will develop BKV-mediated tissue injury in the kidney.
Small data sets of viral sequences derived from healthy patients, HIV-infected subjects, Wiskott-Aldrich syndrome patients, and patients with BKV vasculopathy with capillary leak syndrome did form discrete subclusters. More sequences from such patients are needed to determine whether this finding reflects specific disease association or a distinct epidemiologic origin of the BKV strains selected for study. It is pertinent to note that sequences derived from the patient with capillary leak syndrome did not show any functionally significant mutations (8). A few amino acid changes were found in the VP1 region, but they did not result in any predicted change in VP1 serotype or cellular receptor interaction. Likewise, amino acid changes in the T antigen did not affect the DNA binding domain, host range domain, phosphorylation sites, or any other critical part of this multifunctional molecule.
A phylogenetic separation of Japanese and Pittsburgh BKV isolates from renal transplant patients was seen on analysis of partial VP1 sequences. However, this observation may reflect overrepresentation of subtypes Ic and IV in the Japanese isolates and genotype V in the Pittsburgh BKV strains. It is, thus, unclear whether this separation is due to geographic or genotypic distinction. However, the related polyomavirus JC is known to have specific viral genotypes associated with particular geographical regions (23, 43).
The NCCR includes the regulatory region enhancer and promoter sequences as well as origin of replication (ori). Genetic changes in the regulatory region are known to occur during viral growth in vitro, and it is believed that these changes can promote the selection of tissue culture-adapted strains (7, 8, 41). Whether similar changes occur in the coding region of the viral genome is not known. To address this question, we compared the DNA sequences of BKV Dunlop (36), AS (44, 46), UT (41), and MM (46) strains with 42 other strains that were sequenced directly after DNA amplification from clinical specimens. The cell-cultured strains MM, Dun, UT, and AS differed from the other strains at position 420 in the agnogene (A→T) and at position 5142 in the T-antigen/small t common gene region (A/G→C). The large T-antigen region also showed several differential substitutions at positions 3034, 3592, 3874, and 4980, with A, T, C, and C in the cell-cultured strains and T, A, T, and T in remaining strains. This observation suggests that in vitro culture can result in nucleotide substitutions in the coding regions of the viral genome. Since all nucleotide alterations were synonymous, the functional effect of these substitutions is doubtful. Nonetheless, these observations demonstrate that BKV is a rapidly evolving virus in vitro and in vivo. No nucleotide differences specific for the cell-cultured strains were observed in VP1, VP2, and VP3 genes.
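The four large T-antigen positions listed above lend themselves to a simple lookup. The Python sketch below counts how many of those positions in a given sequence match the bases reported for cell-cultured strains versus the remaining strains. The bases come from the text; the function name `culture_signature`, the dictionary input format, and the assumption that positions are numbered as in the paper's coordinate system are illustrative only.

```python
# position -> (base in cell-cultured strains, base in remaining strains),
# as reported in the text for the large T-antigen region.
LARGE_T_SITES = {
    3034: ("A", "T"),
    3592: ("T", "A"),
    3874: ("C", "T"),
    4980: ("C", "T"),
}


def culture_signature(bases):
    """Count culture-like and clinical-like bases at the four sites.

    bases -- dict: genome position -> observed base (hypothetical input;
             positions are assumed to use the paper's numbering)
    Returns (n_culture_like, n_clinical_like).
    """
    culture_like = clinical_like = 0
    for position, (culture_base, clinical_base) in LARGE_T_SITES.items():
        base = bases.get(position, "").upper()
        if base == culture_base:
            culture_like += 1
        elif base == clinical_base:
            clinical_like += 1
    return culture_like, clinical_like


print(culture_signature({3034: "A", 3592: "T", 3874: "C", 4980: "C"}))
print(culture_signature({3034: "T", 3592: "A", 3874: "T", 4980: "T"}))
```

Because the reported substitutions are synonymous, such a count would serve only as a provenance marker for a sequence, not as a predictor of function.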
In summary, our data document the phylogenetic diversity of BKV and have established the existence of clades not previously recognized in the literature. Currently, only limited whole-genome sequence data are available for many important subject categories, including healthy individuals, pregnant women, patients with systemic lupus erythematosus or HIV infection, and bone marrow transplant recipients. Additional whole-genome sequence information is also needed for BKV genotypes II, III, and IV and for viral strains associated with metropolitan sewage. Expanded phylogenetic analyses with these additional BKV sequences will provide a global and statistically more robust schema for classifying BKV into types and subtypes, as has been accomplished for polyomavirus JC (17). Such an effort would allow better definition of the pathogenicity and tissue tropism of specific BKV strains encountered in nature. The phylogenetic diversity of BKV also has important implications for clinical diagnostics. Microbiology laboratories should remain alert to the possibility of continuing evolution of BKV, and to the potential need for future modifications of currently used PCR assays, to ensure that all clinically relevant, naturally occurring viral strains can be detected in clinical samples.
We thank Sujata Patel for technical assistance and Basavaraju Sankarrappa for help with Clustal W software. Christopher Cubitt and Caroline Ryschkewitsch at the NIH generously provided an established experimental protocol for BKV whole-genome sequencing.