Journal of Virology, December 2004, p. 13613-13626, Vol. 78, No. 24
0022-538X/04/$08.00+0 DOI: 10.1128/JVI.78.24.13613-13626.2004
Copyright © 2004, American Society for Microbiology. All Rights Reserved.
Deutsches Krebsforschungszentrum, Heidelberg, Germany
Received 7 May 2004/ Accepted 28 July 2004
| ABSTRACT |
|---|
E5α is present in groups A5, A6, A7, A9, and A11, PVs highly associated with malignant carcinomas of the cervix and penis. E5β is present in groups A2, A3, A4, and A12, i.e., viruses associated with certain warts. E5γ is present in group A10, and E5δ is encoded in groups A1, A8, and A10, which are associated with benign transformations. The phylogenetic relationships between mucosal human PVs are the same when considering the oncoproteins E6 and E7 and the E5 proteins and differ from the phylogeny estimated for the structural proteins L1 and L2. Besides, the protein divergence rate is higher in early proteins than in late proteins, increasing in the order L1 < L2 < E6 ≈ E7 < E5. Moreover, the same proteins have diverged more rapidly in viruses associated with malignant transformations than in viruses associated with benign transformations. The E5 proteins display, therefore, evolutionary characteristics similar to those of the E6 and E7 oncoproteins. This could reflect a differential involvement of the E5 types in the transformation processes.
| INTRODUCTION |
|---|
The most studied mucosal HPV E5 protein is HPV16 E5. It is a small, highly hydrophobic protein, 83 amino acids (aa) long (6, 44), which localizes in the Golgi and in the endoplasmic reticulum (37). According to in silico predictions and circular dichroism analysis, it has three hydrophobic domains with an alpha-helical structure, which could cooperate in determining the final spatial arrangement (2). Many disparate functions have been described for HPV16 E5, but we still lack a hypothesis that brings them together into a coherent framework. The expression of HPV16 E5 upregulates the signal cascade initiated by the epidermal growth factor receptor upon ligand binding, through mitogen-activated protein kinases (14, 47). E5 also binds to the 16-kDa subunit of the membrane H+-ATPase, which is responsible for the acidification of the late endosomes (1, 12). HPV16 E5 modifies the cell response leading to initiation of apoptosis, both ligand mediated and stress induced. Thus, E5-expressing cells are less sensitive to apoptosis induced by Fas or by the tumor necrosis factor alpha-related apoptosis-inducing ligand (24) and are also less prone to apoptosis after UV irradiation (55). In addition, HPV16 E5 reduces gap junction-mediated intercellular communication via dephosphorylation of connexin 43 (36). This results in the loss of tissue homeostatic feedback, which has also been described as an early event in carcinogenesis (27). Finally, the expression of E5 blocks the traffic of major histocompatibility complex class I (MHC-I) and MHC-II molecules to the plasma membrane, thus hampering antigen presentation and T-cell recognition (7, 54). This finding correlates with the diminished MHC-I surface expression observed in vivo in premalignant lesions and in most carcinomas of the cervix (13, 19, 42).
Only limited working knowledge about HPV16 E5 is available thus far, and only scattered reports on the functions of other mucosal HPV E5 proteins have been published. However, it can be hypothesized that whatever the mechanisms connecting the disparate effects associated with HPV16 E5, they emerge from a central effect related to the hydrophobic character of the protein and its localization in the Golgi apparatus (15). In this sense, the only features common to all E5 proteins are their highly hydrophobic nature and their location in the PV genetic map, which is strictly conserved in PVs (45). E5 proteins are encoded in the E2-L2 region of the PV genome. This region is present in mucosal HPVs (supergroup A), ungulate fibropapillomaviruses (supergroup C), and animal and human cutaneous PVs (supergroup E); it is absent in EV- and melanoma-associated PVs (supergroup B) (9). The only criterion hitherto used to name a putative open reading frame (ORF) as E5 was its presence in this E2-L2 segment. This has led to a proliferation of putative E5 proteins even within a single genome, as is the case for the E5a, E5b, and E5c proteins in HPV18 and in HPV54. Moreover, some ORFs in the E2-L2 region have been identified as E5 despite the absence of a start codon, as is the case for HPV26 E5 or HPV30 E5. Finally, some ORFs that are not encoded in the E2-L2 region but overlap E2 and/or L2 have also been termed E5, as in BPV4 E5 or HPV1 E5. The number of sequences identified as putative E5 proteins has therefore increased to 110, but their chemistry, biology, and phylogeny are largely unknown.
In the present work we have analyzed the phylogenetic and chemical relationships between the mucosal HPV E5 proteins and have identified four different families of related E5 proteins. We describe here the evolutionary characteristics of these proteins and compare them with those of the early oncoproteins E6 and E7 and with those of the structural proteins L1 and L2. The divergence rate and overall evolutionary pattern of the E5 proteins resemble those of the oncoproteins E6 and E7 and differ from those of the late proteins L1 and L2. Furthermore, we illustrate here for the first time a correlation between the phylogenetic classification of the mucosal HPVs according to their E5 proteins and their involvement in cervical cancer.
| MATERIALS AND METHODS |
|---|
An initial set of putative E5 sequences was obtained and analyzed. All putative ORFs carried in the E2-L2 sequences listed above that were longer than 30 aa and displayed an initial methionine or leucine were identified with the ORF Finder program and included in the analysis. Additionally, other sequences named E5 in the public databases that did not fulfil these criteria were also included. Thus, E5 sequences from HPV types 5, 26, 30, 41, 66, and 69; BPV4; and PsPV identified as such by the original depositors were included despite the absence of a start codon. Moreover, E5 sequences from BPV4 and EcPV, also identified by the depositors, were included although they were not encoded in the E2-L2 segment of the corresponding viruses. The total number of putative E5 proteins in this initial data set was 119. A preliminary phylogenetic analysis was performed with these sequences, as described below. We defined two coherence criteria, one phylogenetic and one chemical, for accepting an E5-like sequence as such. We assumed first that phylogenetically close viruses should display phylogenetically close E5-like translations. Second, we assumed that phylogenetically close E5-like translations should show similar overall chemistry of the polypeptide chain. This chemical coherence was assessed as described below. Only 71 of the 119 sequences met both criteria and were therefore named E5 proteins. These sequences belonged to HPV types 2, 3, 6, 7, 10, 11, 13, 16, 18, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 39, 40, 42, 44, 45, 51, 52, 54, 55, 57, 58, 59, 61, 66, 67, 68, 69, 70, 71, 72, 73, 74AE10, 77, 83, 84, 86, 87, 89, 90, and 91; CPV1; PCPV1; BPV1 and BPV2; RPV; OPV1 and OPV2; EEPV; and DPV. The corresponding E6, E7, L1, and L2 sequences from these viruses were also retrieved and analyzed in parallel to the E5 sequences.
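The ORF selection rule described above (ORFs longer than 30 aa that begin with methionine or leucine) can be illustrated with a short Python sketch. It is not the ORF Finder program and makes several assumptions that do not come from the text: the function name `find_orfs`, the use of ATG, CTG, and TTG as start codons, the restriction to the three forward reading frames, and the toy input sequence are all hypothetical.

```python
from itertools import product

# Standard genetic code, listed in TCAG order (NCBI translation table 1).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Assumption: ATG (Met) plus the Leu codons CTG and TTG are accepted as starts.
START_CODONS = {"ATG", "CTG", "TTG"}


def find_orfs(dna, min_aa=30):
    """Return (start_index, protein) for forward-frame ORFs longer than min_aa."""
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        codons = [dna[i:i + 3] for i in range(frame, len(dna) - 2, 3)]
        start = None  # codon index of the first start codon since the last stop
        for idx, codon in enumerate(codons):
            if start is None:
                if codon in START_CODONS:
                    start = idx
            elif CODON_TABLE.get(codon, "X") == "*":
                # Translate from the start codon up to (not including) the stop.
                protein = "".join(CODON_TABLE.get(c, "X") for c in codons[start:idx])
                if len(protein) > min_aa:
                    orfs.append((frame + 3 * start, protein))
                start = None
    return orfs


# Toy sequence (hypothetical): ATG, 35 alanine codons, then a stop codon.
toy = "ATG" + "GCT" * 35 + "TAA"
print(find_orfs(toy))
```

ORFs that run off the end of the sequence without a stop codon are ignored in this sketch, and only the longest ORF per stop codon is reported; a real ORF search would also scan the reverse strand.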
Phylogenetic analysis. The initial alignments were generated with TCOFFEE, which combines information for both global and local homologies (35). When E2-L2 DNA sequences were aligned, both gap opening and extension end were highly penalized to avoid high sequence alignment scores due to random similarities between long sequences. This precaution was necessary considering the differences in length among the aligned sequences. The result was the input for phylogenetic analysis with the PHYLIP program package (18). A distance matrix was generated with DNADIST or with PROTDIST with Dayhoff PAM250 as a substitution matrix. This output was analyzed with DNAPARS or with PROTPARS to generate a maximum parsimony tree, and with neighbor-joining and FITCH programs to create distance-based trees. The statistical support was assessed by 1,000 cycles of bootstrapping with the SEQBOOT and CONSENSE programs. The clusters and arrangements of individual viruses and virus groups obtained with neighbor-joining and FITCH were similar. The same procedure was performed after generating the initial alignments with CLUSTAL W (23), a progressive alignment algorithm, and with DIALIGN, a local segment alignment algorithm (31). The overall topology was the same in all cases, and only minor changes regarding distances were noticeable.
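The bootstrapping step can be illustrated with a simplified Python sketch. It resamples alignment columns with replacement, which is the idea behind SEQBOOT, but it replaces the PAM-based distances and the tree-building programs with a plain p-distance and a mutual-closest-pair check, so it is only a stand-in for the clade support read from a consensus tree. All function names and the toy alignment are hypothetical.

```python
import random
from itertools import combinations


def p_distance(a, b):
    """Fraction of differing sites between two aligned sequences (gap columns skipped)."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    return sum(x != y for x, y in pairs) / len(pairs) if pairs else 0.0


def bootstrap_replicate(alignment, rng):
    """Resample alignment columns with replacement, keeping rows in register."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[c] for c in columns) for name, seq in alignment.items()}


def closest_pair(alignment):
    """Pair of sequence names with the smallest p-distance."""
    return min(
        combinations(sorted(alignment), 2),
        key=lambda pair: p_distance(alignment[pair[0]], alignment[pair[1]]),
    )


def bootstrap_support(alignment, pair, replicates=1000, seed=0):
    """Fraction of replicates in which `pair` is the closest pair."""
    rng = random.Random(seed)
    target = tuple(sorted(pair))
    hits = sum(
        closest_pair(bootstrap_replicate(alignment, rng)) == target
        for _ in range(replicates)
    )
    return hits / replicates


# Toy alignment (hypothetical sequences, not taken from the paper).
toy = {"seqA": "MLLIVLCV", "seqB": "MLLIVLCV", "seqC": "KDEERQNH"}
print(bootstrap_support(toy, ("seqA", "seqB")))
```

A support value reported as "800 out of 1,000 times" corresponds to 0.8 on this scale.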
Divergence distances from the present E5, E6, E7, L1, and L2 proteins to the corresponding ancestral nodes for the group, clade, or protein ancestor were measured in the consensus phylogenetic tree. Distances for each protein were averaged, and the significance of the differences was assessed with the Kolmogorov-Smirnov test and further validated with Student's unpaired t test when the data were consistent with a normal distribution. Additionally, individual distances for every protein in every virus were compared in pairs with a paired Student's t test.
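As an illustration of the paired comparison, the following Python sketch computes the paired Student's t statistic from two lists of matched distances using only the standard library. The function name `paired_t` and the input values are hypothetical; the P value would still have to be read from a t distribution with the returned degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(x, y):
    """Return (t statistic, degrees of freedom) for a paired Student's t test."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long samples with at least two pairs")
    diffs = [a - b for a, b in zip(x, y)]
    # Standard error of the mean difference (sample standard deviation, n - 1).
    standard_error = stdev(diffs) / sqrt(len(diffs))
    return mean(diffs) / standard_error, len(diffs) - 1


# Hypothetical distances for one protein pair measured in the same three viruses.
t_value, dof = paired_t([2.0, 4.0, 6.0], [1.0, 2.0, 3.0])
print(t_value, dof)
```

The sketch fails with a division by zero when all paired differences are identical, a case that would need separate handling.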
Protein chemistry predictions. Hydrophobicity plots were calculated by using the Kyte-Doolittle hydropathicity scale, a main window of 13 aa, and edges of 5 aa (28). The average GRAVY values for the peptides were calculated as the sum of hydropathy values of all the amino acids, divided by the number of residues in the sequence. Topology predictions were performed at the PRED-CLASS server (http://biophysics.biol.uoa.gr/PRED-CLASS) by using cascading neural networks (41). Transmembrane segments were delineated with the TMHMM algorithm at the HUSAR server (http://genome.dkfz-heidelberg.de) by using hidden Markov model prediction (46) and confirmed at the PRED-TMR server (http://biophysics.biol.uoa.gr/PRED-TMR2) by using neural network prediction (40).
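The GRAVY index and the Ile+Leu+Val content used throughout this work are simple to compute, as the following Python sketch shows. The Kyte-Doolittle values are the published scale; the function names and the toy peptide are hypothetical, and the sliding-window profile uses a uniform 13-aa window rather than the weighted 5-aa edges mentioned above, which is a simplifying assumption.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq):
    """Sum of hydropathy values divided by the number of residues."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def ilv_content(seq):
    """Percentage of residues that are Ile, Leu, or Val."""
    return 100.0 * sum(seq.count(aa) for aa in "ILV") / len(seq)


def hydropathy_profile(seq, window=13):
    """Mean hydropathy in a uniform sliding window centered on each residue."""
    half = window // 2
    return [
        sum(KYTE_DOOLITTLE[aa] for aa in seq[i - half:i + half + 1]) / window
        for i in range(half, len(seq) - half)
    ]


# Toy peptide (hypothetical, not one of the E5 sequences analyzed here).
toy = "MLLIVLCVAFLLGIWTKDE"
print(round(gravy(toy), 2), round(ilv_content(toy), 1))
print([round(v, 2) for v in hydropathy_profile(toy)])
```

Positive GRAVY values indicate an overall hydrophobic peptide; the window profile only covers residues that have a full window around them.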
| RESULTS |
|---|
The third branch of the tree (Fig. 1) embraced HPV groups A2, A3, and A4. HPV61 and HPV72 belonged to this branch and are classified as low-risk types because of their low association with cervical cancer (11, 34). Sequences in this branch ranged between 370 and 605 bp and clustered together in 800 of 1,000 bootstrap replicates. Each of the groups in this branch clustered separately with high bootstrap values, and group A4 appeared as a subtree within the A3 group (Fig. 1).
The fourth branch of the E2-L2 tree (Fig. 1) covered PV group A10. All members appeared in this branch with high bootstrap values (1,000 out of 1,000 times). This group includes not only HPV but also CPV and PCPV (Fig. 1). The closest human relative of both is HPV13, as also described for the phylogeny of L1 sequences (49). PVs in this branch are classified as low-risk types and are usually associated with nonmalignant, external lesions in the genitalia (20, 39). The E2-L2 region of PVs in this branch ranges between 500 and 600 bp.
The fifth branch of the E2-L2 tree (Fig. 1) was closely related to the fourth branch. It enclosed HPV groups A1 and A8. The low-risk HPV40 and HPV42 appeared in this branch. Sequences in this branch ranged between 340 and 560 bp and clustered together with high confidence (1,000 out of 1,000 times). Groups A8 and A1 were clearly separated (Fig. 1). However, HPV54, which belongs to group A1, did not group with HPV32 and HPV42, both members of the A1 group.
In addition to the sharp grouping of E2-L2 sequences into five main branches, sequences from CRPV, COPV, and HPV1 also clustered together. These PVs belong to supergroup E, and the common branching was strongly supported despite the large differences in length, which ranged between 100 bp and 1.6 kbp. This cobranching validated the adequacy of the approach used.
The E2-L2 region of PV encodes six different conserved E5-like proteins. A primary set of putative E5 sequences was built as described above. A preliminary examination of these initial putative E5 sequences showed six main families of evolutionarily related proteins. Many other sequences, however, showed no consistent taxonomic distribution and had no close relatives and no obvious similarities with other putative ORFs from members of the same supergroup or group; they branched alone close to a central point of the tree (data not shown). All these sequences were therefore suspected of being spurious translations. At this first stage, we applied the phylogenetic coherence criterion, assuming that phylogenetically close viruses would display phylogenetically close E5-like translations. Therefore, all the phylogenetically scattered protein sequences suspected of being spurious translations were removed, and a new analysis was performed with the remaining 84 sequences. This second sequence set showed a coherent distribution in six protein groups, with the fine classification within these groups being coherent with the classification into A and C supergroups (Fig. 2) (9). In this final analysis, some of the previously discarded ORFs were included because the taxonomic diversity of the hosts prevented us from discerning between true, distinct E5-like proteins and spurious translations; this was the case for Phocoena spinipinnis PV, canine oral PV, cottontail rabbit PV, and rabbit oral PV. Some other translations were further included to highlight the sequence drift proposed above, such as those in rhesus monkey PV. Finally, some sequences termed E5 in the public databases but not matching any of the six groups were included in the final sequence set. This was the case for BPV4 E5a and E5b and for HPV5 E5. Neither PV bears a real E2-L2 segment, and the predicted E5 proteins overlap the corresponding E2 and/or L2 ORFs.
Groups A5, A6, A7, A9, and A11 showed a conserved ORF ca. 240 bp in length, starting close to the E2 stop codon but never overlapping it. This ORF encodes a protein named E5. For clarity and due to the lack of homology between the different E5 proteins, we termed it E5α. These E5α proteins are highly hydrophobic membrane proteins, with an average GRAVY index of 1.92 and average Ile+Leu+Val content of 44.2%. E5α proteins clustered together confidently, 1,000 times out of 1,000 (Fig. 2). The genetic arrangement of the E2-L2 region in these PVs is shown in Fig. 3. The best studied of these E5α proteins is HPV16 E5α, which is 83 aa long, has a GRAVY index of 1.79, and shows up to three putative transmembrane domains at aa 11 to 29, 36 to 54, and 59 to 76. A TCOFFEE alignment of the E5α proteins is shown in Fig. 4. Amino acid identities among E5α proteins are scarce, but the global hydropathic pattern, showing three highly hydrophobic regions that could correspond to potential transmembrane regions, is conserved in all of them (2). A plot of group A9 E5α proteins showing this hydrophobic profile is given in Fig. 4. The low sequence similarity between E5α proteins is also reflected in their phylogenetic distribution. Thus, they all shared an ancient common ancestor and clustered together 900 out of 1,000 times. However, an early evolutionary split made sequences in groups A9 and A11 diverge from those in groups A5, A6, and A7 (Fig. 2). The initial branching within E5α proteins and the subsequent evolutionary divergence would therefore account for the relatively low sequence homology.
described. We therefore designate it E5β. Sequence homology between E5β proteins is relatively high, as shown in the alignment in Fig. 4. They present one hydrophobic, putative transmembrane region and show an average GRAVY index of 1.24; the average Ile+Leu+Val content is 46.0%. As an example, HPV2 E5β is 48 aa long, has a GRAVY value of 1.03, and shows one putative transmembrane domain (aa 25 to 42). A simultaneous hydrophobic plot for the E5β sequences is given in Fig. 4. The global similarities between E5β proteins can be seen, as they display a hydrophilic N terminus and a putative transmembrane region close to the C terminus.
HPV groups A1, A8, and A10 possess a long E2-L2 region, with ca. 600 bp (Fig. 4). In group A10, the first half of this segment encodes an extremely well-conserved putative protein ca. 90 aa long that we have named E5γ. Like all E5-like proteins, E5γ proteins are highly hydrophobic membrane proteins, with an average GRAVY index of 1.60 and average Ile+Leu+Val content of 46.0%. As an example, HPV11 E5γ is 91 aa long, has a GRAVY value of 1.83, and contains up to three putative transmembrane domains (aa 13 to 37, 42 to 61, and 68 to 87). Almost half of the residues in the E5γ amino acid sequences are identical, and more than 80% are similar. The corresponding TCOFFEE alignments and simultaneous hydrophobic plots of the E5γ proteins are given in Fig. 4. E5γ proteins are therefore highly conserved and are present exclusively in the A10 group, which encompasses PVs infecting humans, chimpanzees, and pygmy chimpanzees. These two facts combined suggest a conserved role for this putative protein in the biology of these viruses.
In the second half of the E2-L2 segment, groups A1, A8, and A10 share a conserved short ORF ca. 150 bp long. We named the putative protein expressed here E5δ. HPV6 and HPV11 E5δ proteins additionally present an extended C terminus, ca. 30 aa in length. All E5δ proteins show a highly hydrophobic, potential transmembrane region of conserved amino acids. The average GRAVY index of E5δ proteins is 1.02, and the average Ile+Leu+Val content is 36.1%. As an example, HPV13 E5δ is 45 aa long, has a GRAVY value of 0.98, and shows a putative transmembrane domain (aa 11 to 33). Certain stretches of the sequence are extremely conserved, such as the pattern GDXW(L, M)XLW or the hydrophobic box downstream (Fig. 4).
The phylogenetic relationships of these E5α, -β, -γ, and -δ proteins; E5a and E5b from ungulates; and some other conceptual translations from PV sequences from the E2-L2 region are depicted in Fig. 2. Each of these proteins clustered separately and confidently, and no closer evolutionary relationship between them could be inferred. This means that if there was a unique ancestor for all of them, it predated the ungulate-primate split and gave rise to six evolutionary pathways leading to six different proteins in a very short time, yielding the star-like pattern of the phylogenetic tree.
The evolutionary pattern of E5-like proteins is different from that of the late proteins L1 and L2, coincides with that of early proteins E6 and E7, and correlates with the clinical manifestations of the viral infection. The HPV E5-like ORFs identified here have been classified into four different groups according to their chemical characteristics and their phylogenetic relationships, and all ORFs carried in the E2-L2 region and suspected of being spurious translations were removed and not analyzed. However, to rule out the possibility that our study dealt only with conceptual translations and had no biological significance, we analyzed the phylogenetic relationships within the early proteins E6 and E7 and the late proteins L1 and L2 in PVs with an E5 gene-like ORF and compared both results. The corresponding protein sequences were aligned by TCOFFEE, and phylogeny was estimated by evaluating the distance matrices after 1,000 cycles of bootstrapping. The corresponding trees for L1 and E6 are depicted in Fig. 5 and 6, respectively. The topology of the trees for L2 and E7 was similar to that of L1 and E6, respectively (data not shown).
The topologies of the phylogenetic trees for the early proteins E6 and E7 are different from those of L1 and L2 and match the description provided above for E5-like proteins. In the early genes studied, there was an ancient splitting event separating three main branches of mucosal HPVs (Fig. 5). The first one comprised groups A5, A6, A7, A9, and A11. These groups enclose all the PVs identified as high-risk PVs and correspond to those encoding an E5α protein. The second branch included groups A2, A3, A4, and A12 and matches those described as encoding an E5β protein. Finally, the third branch encompassed groups A1, A8, and A10, those groups containing an E5δ ORF and also an E5γ ORF in the case of the A10 group. E5-like proteins therefore show the same evolutionary topology as the early proteins E6 and E7. This fact reinforces the validity of our identification of four putative types of mucosal HPV E5-like ORFs (genotypes) as real ORFs whose translations could correlate with the malignancy and clinical manifestations of the infection phenotypes.
The overall topologies of the evolutionary trees of early and late genes in mucosal HPVs are not superimposable. It can therefore be inferred that the selection pressures driving the evolution of L1 and L2 proteins and those driving the evolution of the early genes E6 and E7, the E2-L2 segment, and the E5-like proteins are different and have led to different evolutionary paths that can be accurately tracked. There is therefore a different evolutionary pattern, which parallels the different functions of L1 and L2 (involved in the first contact between virus and host and in virus assembly and release) and those of E6, E7, and E5 (involved in the early steps of viral infection).
Early proteins E5, E6, and E7 diverged more than the late proteins L1 and L2, and those in high-risk viruses evolved more rapidly than those in low-risk viruses.
Having shown that the evolutionary patterns of HPV early and late genes were different, we addressed the question of whether there were also differences in the evolutionary rate between both protein types. We measured and compared the corresponding distances from the present viral proteins to the last common ancestor (LCA) of the group and to the LCAs of the clade and the protein. Here we will define the LCA for every clade α, β, and
as the last common node having given rise to all PVs encoding E5α, E5β, and E5
proteins, respectively. The position of the putative protein LCA was estimated considering the branching point of the trees giving rise to the corresponding ungulate PV protein. Results are depicted in Fig. 7. When comparing distances from present proteins to group and protein LCAs, the divergence percentage increased in the order L1 < L2 < E6 ≈ E7 < E5. When comparing distances from present proteins to clade LCAs, the divergence percentage increased in the order E6 ≈ E7 < E5 (Fig. 7). Thus, while present L1 proteins diverged ca. 18% from the putative LCA, L2 proteins diverged ca. 24%, E6 and E7 diverged ca. 30%, and E5 diverged ca. 42%. These differences reflect again that the selection pressures pushing the evolution of early and late genes in mucosal HPVs are different. The dissimilarities in evolutionary rate between early and late proteins are proportionally the same when looking at the group and protein LCAs (Fig. 7, inset). Thus, assuming that there was a single ancestor for each of the present L1, L2, E5, E6, and E7 proteins and that these common ancestors were contemporary and assuming a constant mutation rate for the HPVs, the early genes have consistently evolved more rapidly than the late genes.
, E5β, E5γ, and E5δ proteins, and the corresponding values were compared. As shown in Fig. 7b, E6, E7, and E5-like early proteins diverged significantly more in high-risk viruses than their counterparts in low-risk viruses, while there are only marginal differences in the evolutionary rate of late proteins. Thus, while E5α has diverged ca. three times faster than the corresponding L1, E5β, E5γ, and E5δ evolved only approximately two times faster than the corresponding L1. Similarly, E6 and E7 in high-risk viruses have diverged approximately two times faster than L1, while in low-risk viruses the ratio reaches only ca. 1.5 times the divergence of L1. Thus, both the evolutionary pattern and the evolutionary rate differ between early genes and late genes in mucosal HPVs and also between high-risk and low-risk PVs.
| DISCUSSION |
|---|
The E2-L2 segment usually encodes short, hydrophobic proteins, named E5 or E5a, E5b, and E5c in PVs containing more than one of these putative ORFs. We propose a classification of the ORFs carried in the E2-L2 region of the mucosal HPVs as a result of having applied two coherence criteria for the E5-like proteins: phylogenetic coherence (phylogenetically close E5 proteins are expected to appear in phylogenetically close viruses) and chemical coherence (phylogenetically close proteins are expected to display similar basic chemical characteristics). Underlying these assumptions is the basic hypothesis that chemistry is the main restriction for protein evolution (3). First, we identified many of these putative proteins as spurious translations, on the basis of their incongruent phylogenetic distribution. We therefore propose that these ORFs, so far designated E5, should not be named as such. The list of ORFs that meet our criteria of chemical and phylogenetic congruence and therefore should be named E5 is provided in Fig. 3. Our results predict that the average divergence between present E5-like proteins and the LCA is more than 40%. On this basis, we propose a change in the nomenclature to designate clearly, with different names, what in reality could be different polypeptides. We therefore suggest that E5-like proteins be named E5α, E5β, E5γ, and E5δ. This nomenclature reflects simultaneously the homogeneity of the different proteins regarding their chemistry and their evolutionary patterns and matches the epidemiological characteristics of the different viruses bearing these ORFs.
The phylogenetic trees of the E2-L2 (Fig. 1) and of the E5 (Fig. 2) DNA segments show a star-like pattern. In both trees, the main branches emerge close to a putative central point, and the relative distances between clades are comparable. It could be claimed therefore that we have compared sequences which do not share any common ancestor and that this fact is responsible for the star-like appearance of the final trees. Evidence, however, suggests that all the present E2-L2 mucosal HPV sequences and the true E5 proteins could have shared a common ancestor. The E2-L2 segment could be a hypervariable region in the mucosal HPVs and is therefore likely to have undergone rapid evolution, as well as insertions, deletions, or recombinations (22). The star-like appearance of the phylogenetic tree of the E2-L2 region DNA sequences would therefore reflect such hypervariability. We have further provided additional evidence regarding the relative evolutionary distances between the present E5-like proteins and the respective LCAs, compared with the corresponding distances for four other genes in the PV genome. Concerning the four groups of HPV E5 sequences, we have shown that there is no evident sequence similarity between them and that the evolutionary divergence between present proteins in different groups rises to 80%. The high hydrophobicity, the high Ile+Leu+Val content, and the presence of transmembrane regions are the only characteristics common to all E5α, -β, -γ, and -δ proteins. Of all E5-like proteins, only the biology of HPV16 E5α is partially known. It localizes mainly in the Golgi apparatus and has been associated with several disconnected effects related to differential response to growth factors and stress, apoptosis initiation, and MHC surface expression (5, 16). These multiple effects could arise from local changes in the membrane chemistry, related to the highly hydrophobic nature of the protein and its transmembrane potential (14). This is the only characteristic common to all E5-like proteins that could account for the multiple effects hitherto associated with them. Experimental data related to other E5 types also point in this direction. Thus, HPV2 E5β is also a Golgi protein and blocks the surface expression of MHC-II molecules (8). In addition, both HPV6 E5
and HPV11 E5
localize in the Golgi and associate with the 16-kDa pore-forming protein component of the vacuolar ATPase (10, 12), also known to be an interaction partner of HPV16 E5α (1, 12). The overall data suggest, therefore, that mucosal HPV E5 proteins share a common ancient ancestor and that they underwent a rapid early divergence process that gave rise to the present four E5 families. The particular composition of the E5 proteins, where the three amino acids Ile, Leu, and Val (representing 13 possible codons) account for more than 45% of the sequence, could have eased the sequence drift proposed here.
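The figure of 13 possible codons for Ile, Leu, and Val can be checked against the standard genetic code with a few lines of Python. This is only an illustrative sketch; the function name `codon_count` is hypothetical, and the codon table is the standard one (NCBI translation table 1) written in TCAG order.

```python
from itertools import product

# Standard genetic code, listed in TCAG order (NCBI translation table 1).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codon_count(amino_acids):
    """Number of codons that encode any of the given amino acids."""
    return sum(1 for aa in CODON_TABLE.values() if aa in amino_acids)


ilv_codons = codon_count("ILV")                    # Ile (3) + Leu (6) + Val (4)
sense_codons = codon_count(set(AMINO) - {"*"})     # all non-stop codons
print(ilv_codons, sense_codons, round(ilv_codons / sense_codons, 3))
```

These three residues are encoded by roughly a fifth of the sense codons, whereas they make up more than 45% of the E5 sequences according to the values reported above.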
The E5-like proteins display the same evolutionary characteristics as E6 and E7. The phylogeny of human mucosal PV, according to L1 and L2, is the same and matches previous reports (9, 22, 38, 50, 51) but does not coincide with that of the early genes. The strong correlation between phylogeny and epidemiology in all the early proteins studied is absent in the corresponding analysis for the late proteins L1 and L2. This fact shows that the structural proteins L1 and L2 have a secondary role, if any, in the malignant transformations associated with viral infection.
The divergence rate at the protein level increases in the progression L1 < L2 < E6 ≈ E7 < E5. There is, therefore, a clear gradient in the rate of divergence from late genes, which evolve more slowly, to early genes, which evolve more quickly. In the same direction, the divergence rate of the different E5-like proteins followed the progression E5
< E5β ≈ E5
< E5α. This reinforces again our proposal that the E5-like proteins here identified are real proteins and that there is a correlation between the E5 version encoded in a given PV genome and its higher or lower association with the development of neoplasia.
The findings that early proteins have diverged more than late proteins and that early proteins in high-risk viruses have evolved more than early proteins in low-risk viruses are consistent with the involvement of early proteins in the initial transformation processes of the viral infection (17). The expression of E6 and E7 modifies the normal cell cycle and alters the differentiation program of the keratinocyte, thus allowing viral DNA replication. E6 and E7 initially bind p53 and retinoblastoma protein p105RB, respectively, although both are known to have other cellular targets (17, 52). The expression of E5, on the other hand, elicits a multitude of apparently disconnected effects that enhance those of E6 and E7 (5, 16) and which could arise from a modification of cell membrane chemistry (15). The cellular binding partners of L1 and L2 are still unknown, but it can be inferred from our results that they will not be involved in cellular homeostasis to the same extent as those of E6, E7, and E5. The increased divergence rate in early genes, especially in high-risk PVs, could have arisen as a result of a coevolutionary arms race between virus and host. In the case of E5, the high hydrophobic content would have potentiated the divergence. A complementary view would explain the increased divergence rate of early genes compared to late genes as a reflection of the high number of interaction partners of these early proteins. Thus, the higher the number of interaction partners of a protein, the broader its effects and the higher its divergence rate. This view would match a scenario where the number of interaction partners and the multiplicity of biological effects on the infected cell also increase in the sequence L1 < L2 < E6 ≈ E7 < E5.
E5-like proteins can be classified into four groups according to their chemical characteristics and evolutionary relationships. This classification matches the epidemiological characteristics of the mucosal HPVs and their differential association with cancer development (11, 34). Moreover, the evolutionary pattern and divergence rate of the E5 proteins agree with those of the early genes E6 and E7, but not with those of the late genes L1 and L2. To date, most of the data available refer to the E5α protein, and few reports are available about the biological effects of E5β, E5γ, and E5δ. The different evolutionary history of the early and the late genes raises the question of which gene (if any) reflects the true evolutionary history of the PV; it does not exclude the presence of an initial period where recombination and horizontal exchange of genetic material between viruses could have been possible. Finally, the properties here analyzed and predicted for these proteins suggest that their characterization could provide us with new insights into the biology and the diversity of clinical manifestations of the PV infection in humans.
| ACKNOWLEDGMENTS |
|---|
I.G.B. is the recipient of a Fundación Ramón Areces postdoctoral fellowship.