Journal of Virology, May 2007, p. 4564-4571, Vol. 81, No. 9
0022-538X/07/$08.00+0 doi:10.1128/JVI.02104-06
Copyright © 2007, American Society for Microbiology. All Rights Reserved.

Romina Oliva,2,#
Anna Tramontano,2,3
Agostino Cividini,1,4
Milvia Casato,5
Giampaolo Merlini,6
Enrico Silini,7 and
Mario U. Mondelli1,4*
Area Infettivologica e Centro di Epatologia, Fondazione IRCCS Policlinico S. Matteo, Pavia, Italy,1 Dipartimento di Biochimica, Università La Sapienza, Roma, Italy,2 Istituto Pasteur-Fondazione Cenci Bolognetti, Università La Sapienza, Roma, Italy,3 Dipartimento di Malattie Infettive, Università di Pavia, Pavia, Italy,4 Dipartimento di Medicina Clinica, Università La Sapienza, Roma, Italy,5 Area di Biotecnologie e Tecnologie Biomediche, Fondazione IRCCS Policlinico S. Matteo and Università di Pavia, Italy,6 Dipartimento di Patologia e Medicina di Laboratorio, Università di Parma, Parma, Italy7
Received 26 September 2006/ Accepted 13 February 2007
| ABSTRACT |
|---|
| INTRODUCTION |
|---|
In an effort to identify HCV-specific motifs potentially associated with the development of cryoglobulinemia, a prototypical B-cell lymphoproliferative disorder, a recent study reported a high prevalence of a single amino acid insertion at position 385 within HCV hypervariable region 1 (HVR1), which was found exclusively in samples from patients infected with HCV genotype 1b with symptomatic cryoglobulinemia (14). However, these provocative data were not confirmed in two subsequent studies (17, 24), and in only one of them were two positions within HVR1 and three within HVR2/CD81 associated with the presence of cryoglobulinemia (17). Unfortunately, these studies suffered from small sample sizes, and in addition, with one exception (24), only patients infected with genotypes 1a and 1b were examined (14, 17). In the present study we analyzed, by using statistical and bioinformatics approaches, a large cohort of patients with HCV infection caused by genotypes 1 and 2, with and without detectable cryoglobulins, for the presence of specific motifs within HVR1. We also analyzed the entire E2 sequence in samples from a subset of patients, to verify whether properties of other regions of the protein could be correlated with the phenotype.
| MATERIALS AND METHODS |
|---|
Informed consent to donate a blood sample was obtained from all patients. Patients' blood samples were processed according to a rigorous protocol to maintain a constant temperature of 37°C, to avoid coprecipitation and partitioning of virions with cryoglobulins, which may interfere with HCV RNA amplification (1). Sera were immediately frozen in sterile cryovials and stored at -80°C until used, as previously described (7). To ensure complete solubilization of cryoglobulins upon thawing, sera were maintained at 37°C for 30 min prior to RNA extraction with a QIAamp viral RNA mini kit (QIAGEN, Valencia, CA).
The study protocol conformed with the ethical guidelines of the 1975 Declaration of Helsinki and was specifically approved by the Institutional Review Board and Ethical Committee of Fondazione IRCCS Policlinico San Matteo, Pavia (Coordinating Center) on 28 June 2004.
Amplification of HVR1 in the E2 region of HCV genotypes 1b and 2a/c by PCR, cloning, and sequencing. E2-HVR1 sequences were amplified by nested reverse transcription-PCR using AmpliTaq DNA polymerase (Applied Biosystems) as previously reported (5). Two sets of primers were used for separate amplification of the genotype 1b and 2a/c sequences, yielding final DNA fragments of 176 bp, spanning nucleotides (nt) 1428 to 1603, numbered according to the sequence of reference strain H77 (GenBank accession no. AF009606) in agreement with the recommendations of an international panel of experts (29). The genotype-1b HVR1 primer set was Outer AS (nt 1612 to 1633) 5'-TCATTGCAGTTCAGGGCAGTCC-3', Outer S (nt 1395 to 1413) 5'-CACTGGGGAGTCCTGGCGG-3', Inner AS (nt 1584 to 1603) 5'-TGCCAGCTGCCGTTGGTGTT-3', and Inner S (nt 1428 to 1447) 5'-TCCATGGTGGGGAACTGGGC-3'. The genotype 2a/c primer set was Outer AS (nt 1614 to 1634) 5'-GTCATTGCAATTCAGGGCAGT-3', Outer S (nt 1395 to 1414) 5'-CACTGGGGCGTGATGTTTGG-3', Inner AS (nt 1585 to 1603) 5'-TGCCAACTGCCATTGGTGT-3', and Inner S (nt 1428 to 1446) 5'-TCCATGCAGGGAGCGTGGG-3'.
The amplification products were analyzed by gel electrophoresis, purified by using a QIAquick PCR purification kit (QIAGEN, Valencia, CA), and cloned using a TA cloning kit (Invitrogen, Leek, The Netherlands). Sequencing was performed on PCR products from at least 10 to 12 clones per patient, for a total number of 1,207 clones, by using the Big Dye terminator cycle sequencing kit on an ABI 310 genetic analyzer (Applied Biosystems, Foster City, CA) according to the manufacturers' instructions. The deduced amino acid sequences of 58 residues from all clones were aligned with published HVR1 sequences from HCV genotypes 1b and 2a/c and examined for mutations, insertions, or deletions and for specific amino acid motifs. From the 1,207 clones, 449 unique, nonrepetitive HVR1 sequences were obtained.
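The step from cloned nucleotide sequences to deduced, nonrepetitive amino acid sequences can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names `translate` and `unique_sequences` are hypothetical, translation is assumed to start in frame at the first base of the amplicon, and the demo input is the genotype 1b Inner S primer quoted above rather than a patient clone.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(nt_seq):
    """Translate in frame 0; trailing bases that do not fill a codon are dropped."""
    nt_seq = nt_seq.upper()
    usable = len(nt_seq) - len(nt_seq) % 3
    # Codons containing ambiguous bases (e.g. N) are rendered as "X".
    return "".join(CODON_TABLE.get(nt_seq[i:i + 3], "X") for i in range(0, usable, 3))


def unique_sequences(seqs):
    """Remove exact duplicates while keeping first-seen order."""
    return list(dict.fromkeys(seqs))


# A 176-nt amplicon holds 58 complete codons (2 bases are left over).
print(176 // 3)  # 58
# Toy input (assumption): the Inner S primer, used only to exercise the function.
print(translate("TCCATGGTGGGGAACTGGGC"))  # SMVGNW
```

Applying `unique_sequences` to the translated clones of all patients would give the kind of nonrepetitive set described above; whether duplicates were collapsed at the nucleotide or amino acid level is not stated in the text.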
Denaturing polyacrylamide gel electrophoresis of E2-HVR1 nested-PCR products. The HVR1-amplification products of 176 bp obtained after the second round of nested PCR were immediately analyzed using a rapid screening method that would resolve differences of at least 3 nt, i.e., insertions or deletions of 1 or more amino acids (aa). For optimal separation of the PCR amplicons, a 5% denaturing polyacrylamide gel electrophoresis (PAGE) solution was set up as follows: 3.75 ml acrylamide/bisacrylamide (19:1 vol/vol) 40% (Sigma-Aldrich, St. Louis, MO), 15 g urea (Fluka; Sigma-Aldrich), 3 ml of 10-fold-concentrated Tris-borate-EDTA buffer (Invitrogen), 150 µl 10% ammonium persulfate (Sigma, Sigma-Aldrich), and 20 µl N,N,N',N'-tetramethylethylenediamine (Sigma, Sigma-Aldrich) in a final volume of 30 ml. The gel solution was poured in a layer of 0.4 mm between two glass plates measuring 25 x 20 cm. A small amount of each PCR product was loaded and run near a specific molecular marker 176 bp in length (representing wild-type HVR1) and markers of 176 bp plus multiples of 3 nt. The electrophoresis was run for approximately 90 min at 950 V, and the cDNA bands were visualized by silver staining.
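As a quick consistency check on the gel recipe, the final concentrations implied by the listed volumes can be worked out directly. The sketch below only restates the quantities given in the text; the variable names are illustrative, and the molar mass of urea (60.06 g/mol) is the one value taken from outside the article.

```python
FINAL_VOLUME_ML = 30.0

# Acrylamide/bisacrylamide: 3.75 ml of a 40% stock.
acrylamide_pct = 3.75 * 40.0 / FINAL_VOLUME_ML
# Tris-borate-EDTA: 3 ml of a 10-fold-concentrated stock.
tbe_fold = 3.0 * 10.0 / FINAL_VOLUME_ML
# Ammonium persulfate: 150 µl (0.150 ml) of a 10% stock.
aps_pct = 0.150 * 10.0 / FINAL_VOLUME_ML
# Urea: 15 g dissolved in the final volume; molar mass 60.06 g/mol.
urea_molar = 15.0 / 60.06 / (FINAL_VOLUME_ML / 1000.0)

print(f"acrylamide: {acrylamide_pct:.2f}%")   # matches the stated 5% gel
print(f"TBE: {tbe_fold:.1f}x")
print(f"ammonium persulfate: {aps_pct:.3f}%")
print(f"urea: {urea_molar:.2f} M")
```

The acrylamide figure reproduces the stated 5% gel, which suggests that the listed volumes and the 30-ml final volume are mutually consistent.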
PCR titration. Since previous findings suggested that a significant number of patients with cryoglobulinemia showed a 1-aa insertion at position 385 within HVR1, two fragments of different lengths, 176 bp and 179 bp, were ligated into a vector using the TA cloning kit (Invitrogen, Leek, The Netherlands). To assess the sensitivity of our PCR assay for detecting insertion/deletion variants of HVR1, gel titration experiments were performed using variable starting ratios of a wild-type (176 bp) and an insertion variant (179 bp) cloned sequence. The mixture was amplified by PCR and loaded on a denaturing PAGE gel under the same conditions mentioned above. The intensities of the two bands of 176 bp and 179 bp obtained from the PCR were subsequently analyzed. These experiments showed that it was possible to detect a 179-bp variant within a pool of 30 wild-type sequences. The ratio of staining intensities of wild-type and variant HVR1 from each patient was used to estimate the number of HVR1 clones needed to represent each pool of sequences appropriately.
Amplification by PCR, cloning, and sequencing of complete E2 of HCV. The complete envelope 2 (E2) region of HCV was also analyzed for 58 of the patients: 29 patients with genotype 1b (21 with and 8 without detectable cryoglobulins) and 29 patients with genotype 2a/c (21 with and 8 without detectable cryoglobulins). Two primer sets spanning the entire E2 region were used to separately amplify genotype 1b and 2a/c sequences, yielding final PCR products of 1,170 nt for genotype 1b and 1,192 nt for genotype 2a/c (nt 1428 to 2579; numbered according to reference sequence H77, GenBank accession no. AF009606). The genotype 1b E2 primer set was Outer AS (nt 2637 to 2659) 5'-CAGAAGAACACAAGGAAGGAGAG-3', Outer S (nt 1395 to 1413) 5'-CACTGGGGAGTCCTGGCGG-3', Inner AS (nt 2574 to 2597) 5'-CACCAGGTTCTCTAAGGCGGCCTC-3', and Inner S (nt 1428 to 1447) 5'-TCCATGGTGGGGAACTGGGC-3'. The genotype 2a/c primer set was Outer AS (nt 2661 to 2682) 5'-GACCTTTAATACACCAAGCGGC-3' and 5'-GACCCTTGATGTACCAAGCA(T)GC-3', Outer S (nt 1395 to 1414) 5'-CACTGGGGCGTGATGTTTGG-3', Inner AS (nt 2586 to 2607) 5'-CATGCAAGAT(C)GACCAG(A)CTTCTC-3', and Inner S (nt 1428 to 1446) 5'-TCCATGCAGGGAGCGTGGG-3'.
The PCR products were cloned as above and approximately 8 clones for each of the 58 patients were isolated and sequenced, for a total of 449 sequences which yielded 269 unique E2 amino acid sequences (about 4 to 5 different clones for each patient). Of these, 114 sequences were extracted and added to the previous 449 unique HVR1 sequences obtained with the HVR1 primer sets, giving a total of 563 different HVR1 sequences; 548 of these were used for bioinformatics analysis as described below.
Statistical and bioinformatics analysis. We analyzed both the complete data set including all available nonredundant protein sequences and a reduced subset including 111 representative HVR1 sequences, one for each patient studied (data from 2 patients with cryoglobulinemia infected with genotype 1b were lost during electronic transfer of the database). The latter was constructed using, for each patient, the sequence with the highest sequence identity to each other sequence from the same patient.
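One way to read the selection rule for the reduced subset is as a medoid choice: for each patient, keep the sequence whose mean identity to the patient's other sequences is highest. The sketch below follows that reading under stated assumptions: sequences are already aligned to equal length, identity is the fraction of matching positions, ties go to the first sequence, and the function names and toy 7-residue strings are invented for illustration.

```python
def identity(a, b):
    """Fraction of identical positions between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def representative(seqs):
    """Return the sequence with the highest mean identity to all the others."""
    if not seqs:
        raise ValueError("at least one sequence is required")
    if len(seqs) == 1:
        return seqs[0]

    def mean_identity(i):
        others = [s for j, s in enumerate(seqs) if j != i]
        return sum(identity(seqs[i], s) for s in others) / len(others)

    # max() keeps the first index when several sequences tie.
    return seqs[max(range(len(seqs)), key=mean_identity)]


# Toy clones for one hypothetical patient (not data from the study).
clones = ["ETHVTGG", "ETYVTGG", "ETHVSGG", "ATHVTGA"]
print(representative(clones))  # ETHVTGG
```

In the toy set the first clone differs from each of the others at only one or two positions, so it is the natural representative.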
Fisher's exact test was employed to examine whether there was any position within HVR1 that showed a composition that differed statistically between the sets of positive and negative data, i.e., to discriminate between patients with and without cryoglobulinemia; between the sets of data for genotypes 1b and 2a/c; and between patients with and without cryoglobulinemia within each genotype. The Fisher exact test, applied to two independent samples, was used to provide a measure for the probability that the data belonged to the same distribution. The k-means clustering method, with k varying from 1 to 5, was employed to detect clusters of sequences that discriminated significantly between patients with and without cryoglobulinemia. To determine whether the sequences could be divided into families by searching for positions that had a specific distribution in some members of our data set, the tree determinant-residue identification (Treedet) method (11), which can detect such cases on the basis of a statistical analysis of multiple-sequence alignments, was used. Correlations of mutations were also sought in the alignment of positive and negative samples, in order to see whether second-order effects could be responsible for the phenotype, i.e., whether there was any pair of positions that would vary in a correlated fashion in each of the two data sets (sequences from cryoglobulinemic versus noncryoglobulinemic patients), with the aim of comparing them (15). Principal component analysis (PCA) (9) was also applied to the frequency table obtained from our data. This mathematical procedure transforms a number of (possibly) correlated variables into a (smaller) number of uncorrelated variables called principal components. The first principal component accounts for as much of the variance in the data as possible, and each succeeding component accounts for as much of the remaining variance as possible. 
With this approach, it is possible to examine how many and which of the input variables are noncorrelated and can therefore be used to best separate the data. Each data set was first considered as a single string obtained by concatenating the frequency of occurrence for all 20 amino acids in the 27 positions. The first PCA components should indicate which independent variables, i.e., position and amino acid, can better separate the positive and negative samples. A PCA analysis was also performed on the frequency tables for each of the 27 positions in the two data sets. A sequence logo representation of the positions that appear more distant in the PCA analysis (see Results) was obtained by using Weblogo version 2.8.2 (http://weblogo.berkeley.edu/) (10, 27) for the 350 sequences from patients with and the 198 sequences from patients without detectable cryoglobulinemia, using default parameters and a bitmap resolution enhanced to 600 dpi. Phylogenetic trees were constructed using the neighbor-joining method (26) implemented in the Phylip package, version 3.66 (12). One hundred trials of bootstrap analysis were performed.
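The PCA input described above, a single string of amino acid frequencies for 20 residues at 27 positions, can be sketched as a 540-element vector. This is an illustrative reconstruction only: it assumes ungapped sequences aligned to the 27 HVR1 positions (384 to 410), the names `frequency_table` and `as_vector` are hypothetical, the two example strings are toy input, and the PCA step itself is not shown.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
N_POSITIONS = 27                        # HVR1 positions 384 to 410


def frequency_table(seqs, n_positions=N_POSITIONS):
    """Return an n_positions x 20 table of per-position amino acid frequencies."""
    if not seqs:
        raise ValueError("at least one sequence is required")
    table = [[0.0] * len(AMINO_ACIDS) for _ in range(n_positions)]
    for seq in seqs:
        if len(seq) != n_positions:
            raise ValueError("sequences must be aligned to n_positions residues")
        for pos, aa in enumerate(seq):
            # str.index raises ValueError for gaps or nonstandard residues.
            table[pos][AMINO_ACIDS.index(aa)] += 1.0
    return [[count / len(seqs) for count in row] for row in table]


def as_vector(table):
    """Concatenate the rows into one flat vector (27 x 20 = 540 values)."""
    return [value for row in table for value in row]


# Toy 27-residue sequences differing at one position (illustration only).
toy = ["ETHVTGGSAGRTTAGLVGLLTPGAKQN", "ETYVTGGSAGRTTAGLVGLLTPGAKQN"]
table = frequency_table(toy)
print(len(as_vector(table)))                 # 540
print(table[2][AMINO_ACIDS.index("H")])      # 0.5
```

Building one such vector per data set (cryoglobulin-positive and cryoglobulin-negative) gives the frequency tables that a PCA routine would then decompose.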
Nucleotide sequence accession numbers. All 563 nonrepetitive HVR1 nucleotide sequences have been submitted to GenBank and were assigned accession nos. EF198910 to EF199472.
| RESULTS |
|---|
These results were confirmed by cloning and sequencing (see below). PAGE was also tested as a possible approach for the rapid identification of the previously reported insertion at position 385, and Fig. 1 shows representative findings from three patients with HVR1-length polymorphisms and 10 with wild-type HVR1 sequences. Sequencing of 10 to 12 clones from each patient also revealed a correspondence between the intensity of the extra bands and the relative number of clones carrying the insertion, although the sequencing data were obviously more accurate.
In summary, 1,656 clones from 113 patients were sequenced; 1,207 sequences were amplified using HVR1-specific primers, whereas 449 were obtained after amplification with primers of the entire E2-region sequence. From these, 548 nonrepetitive HVR1 sequences from 111 of the 113 patients enrolled (15 sequences from 2 patients with cryoglobulinemia infected with genotype 1b were lost during electronic transfer of the database) were extracted and subjected to further analysis. This extensive analysis of a large number of HVR1 sequences revealed the occurrence of insertions in sequences from 6.2% of cryoglobulinemic patients and from 9.1% of noncryoglobulinemic ones. The occurrence was higher in sequences from patients with genotype 2a/c (6.5% of those with and 11.8% of those without cryoglobulins) than in sequences from patients infected with genotype 1b (5.9% with versus 6.2% without cryoglobulins, respectively). There was no correlation between the presence of mutations within HVR1 and viral load, serum alanine aminotransferase and gammaglobulin, patient age, sex, and the presence or absence of symptoms of cryoglobulinemia.
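The sequence counts reported in Materials and Methods and in this summary can be cross-checked with simple arithmetic. The sketch below only restates numbers given in the text; the variable names are illustrative.

```python
hvr1_primer_clones = 1207   # clones amplified with HVR1-specific primers
e2_primer_clones = 449      # clones amplified with complete-E2 primers
unique_hvr1 = 449           # unique HVR1 sequences from the HVR1 primer sets
added_from_e2 = 114         # HVR1 sequences extracted from E2 clones
lost_in_transfer = 15       # sequences from 2 patients lost during transfer

total_clones = hvr1_primer_clones + e2_primer_clones
total_unique = unique_hvr1 + added_from_e2
analyzed = total_unique - lost_in_transfer

# GenBank accession range EF198910 to EF199472, inclusive.
accessions = 199472 - 198910 + 1

print(total_clones)   # 1656
print(total_unique)   # 563
print(analyzed)       # 548
print(accessions)     # 563
print(350 + 198)      # 548 sequences with plus without cryoglobulinemia
print(113 - 2)        # 111 patients retained
```

All reported totals agree: 1,656 clones, 563 nonrepetitive sequences matching the 563 accession numbers, and 548 sequences from 111 patients entering the analysis.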
Analysis of the hydropathicity profile (18) of HVR1 revealed a hydrophobicity plot common to all sequences of both genotype 1b and 2a/c, with a variable N-terminal region and a more conserved C-terminal region, in agreement with previously published findings showing conformational conservation of HVR1 (22 and data not shown).
Complete E2-region sequences. Complete sequences of the E2 region (aa 384 to 746) were derived from 58 patients. A total of 449 clones were sequenced, 269 of which showed amino acid changes in the E2 region (mean of 4.5 clones per patient). The sequences were aligned and compared to the published HCV genotype 1b (J, BK, and H, the latter having a recognized higher affinity for CD81) and 2a/c (BEBE1/2c and JCH-2/2a) strains. Amino acid changes within E2 were compared for the two groups of patients with and without detectable cryoglobulins. A single amino acid insertion (proline) was found at position 576 for a single patient (patient 162) with cryoglobulinemia. As expected, conserved and nonconserved amino acid substitutions were found in the E2 region, predominantly within HVR1 and HVR2 and between these two regions, but no clustering of changes was observed for cryoglobulinemic patients. In particular, specific sequence changes within the CD81-binding site were not observed for cryoglobulinemic subjects. Conserved regions previously described by others were confirmed (the WHY motif, the 502-to-520-aa region, and all cysteine residues).
Analysis of HVR1 region by bioinformatics tools. The aim of this analysis was to evaluate whether a correlation exists between a cryoglobulinemic phenotype in patients with chronic HCV infection and changes within the HVR1. The large set of sequences from 111 of the 113 patients enrolled in this study and the subset containing only one representative sequence per patient (see Materials and Methods) were investigated using several methods. It is important to emphasize that our data set contained a substantial number of sequences, so that our results cannot simply be ascribed to sampling limitations. As a first step, we performed Fisher's exact test to determine whether there was any position within HVR1 that showed a composition that differed statistically between the sets of positive (i.e., sequences from cryoglobulinemic patients) and negative (i.e., sequences from noncryoglobulinemic patients) data. For the sake of comparison, the same analysis was performed to discriminate between patients infected with genotypes 1b and 2a/c, and, within each genotype, between patients with and without cryoglobulinemia. Fisher's exact test, when applied to two independent samples, provides an estimate of the probability that they belong to the same distribution. All the positions have probability values well above the significance threshold (5% divided by the 27 comparisons performed) (Fig. 2); thus, the compared samples cannot be assigned to different distributions.
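The per-position test with the corrected threshold (5% divided by 27 comparisons) can be illustrated with a simplified sketch. It reduces the comparison to a 2 x 2 table (one amino acid versus all others at one position, positive versus negative sequences), which is an assumption; the article does not say how the contingency tables were built. The function names are hypothetical and the counts in the example are invented.

```python
from math import comb

N_POSITIONS = 27
ALPHA = 0.05 / N_POSITIONS  # corrected threshold, about 0.00185


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    low, high = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum all tables that are no more probable than the observed one.
    p = sum(prob(x) for x in range(low, high + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


def is_significant(a, b, c, d, alpha=ALPHA):
    return fisher_exact_two_sided(a, b, c, d) < alpha


# Invented counts: residue present/absent in positive vs negative sequences.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p_value, 4))           # 0.4857
print(is_significant(3, 1, 1, 3))  # False
```

A position would count as discriminating only if its P value fell below `ALPHA`; in the study, no position reached that threshold.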
Next, we performed several PCAs (see Materials and Methods). In our setting, the first component was sufficient to explain most of the variance of the data and none of the variables could be considered significantly more discriminatory than the others. We report here the most meaningful ones, with the caveat that they are not statistically more significant than the others: A in position 493, N in position 384, Y in position 386, G in position 398, K in position 410, P in position 405, S in position 391, T in position 384, and T in position 396. We also repeated the PCA on the frequency tables for each of the 27 positions in the two data sets (sequences derived from samples from cryoglobulinemia-positive and -negative patients). In this case, several components (at least five) need to be used to explain 80% of the variance of the data. The positions that appear more distant between the two sets of data in the PCA (i.e., the positions where the amino acid composition differs more) are 384, 386, 389, 392, 396, 397, 398, 399, and 405. We concentrated on these positions and manually analyzed the frequency matrix. As can be seen from Table 2 and Fig. 3, the differences between negative and positive sequences in these positions were not statistically significant, and specifically, there was no case where the composition was completely nonoverlapping between positive and negative samples. Therefore, even if this effect were significant, it would not have sufficient predictive power to identify patients with cryoglobulinemia.
| DISCUSSION |
|---|
In consideration of these negative findings, we set out to analyze the possible existence and significance of HVR1-sequence-specific motifs that would discriminate between patients with and without cryoglobulinemia. Using several bioinformatics approaches, no statistically significant differences were found, indicating that HVR1 sequencing is not useful for the identification of a cryoglobulinemic phenotype. Our findings are in contrast with those of Hofmann and coworkers (17), who reported that positions 386, 387, and 396 within HVR1 could be predictive of cryoglobulinemia in a rather small number of patients with detectable cryoprecipitate. However, the authors used a correlation coefficient analysis, which is less sensitive than Fisher's exact test, and a classifier that is not described in detail, which may have influenced their conclusions. Our data do not support these results and lead to the conclusion that the observed differences likely reflect biases in their data set. Our data are instead comparable with those of Rigolet and associates (24), who failed to detect molecular features associated with B-cell clonal expansion and cryoglobulinemia. In this study, phylogenetic analysis did not reveal clustering associated with lymphoproliferative disorders, nor were the N-terminal insertions or residues at positions 4 and 13 discriminative for such conditions. Moreover, the high frequency of the insertion at position 385 of the HVR1 deduced amino acid sequence could not be ascribed to genotype 1, since in our study we examined more than twice as many patients infected with the same genotype, from whom a large number of cloned sequences were derived, and found only a low prevalence of insertions in that position. The frequency of changes in sequences from genotype 2-infected patients was higher than that found in sequences from genotype 1, particularly in clones derived from patients without cryoglobulinemia. 
These findings most likely reflect a stronger positive selection of HVR1 variants in patients infected by genotype 2 than in those infected by genotype 1, as reported previously by our group (6, 21). Evidence supporting the contention that HVR1 may be under positive selection is currently derivative rather than direct, although we are not aware of alternative evidence supporting random variation or genetic drift as being responsible for the emergence of HCV variants. Supportive evidence for HVR1 being under selective pressure in the context of a relatively stable HCV replication rate comes from the following: (i) rapid and continuous build-up of quasispecies within individual patients; (ii) higher variation in HVR1 than in other HCV regions; (iii) HVR1 evolutionary rates that differ among patients; and (iv) a higher dN/dS (nonsynonymous mutation/synonymous mutation) ratio within HVR1 than in other regions. Studies on the pathogenetic mechanisms responsible for variant selection in this setting emphasized the importance of B-cell responses (21, 31) in the active selection process, in agreement with a study which showed lower nucleotide and amino acid variation in HCV isolates from patients with genetic immunoglobulin defects than in isolates from control HCV patients (5), but failed to provide a plausible biological explanation for this finding beyond the well-recognized associations of genotype 2 with mild liver disease (23, 28) and better responses to antiviral treatment (3).
According to the results of our analysis, and because of the large number of sequences examined, it is highly unlikely that HVR1-sequence-specific motifs are associated with an initial lymphoproliferative disorder such as that associated with cryoglobulinemia. Recently, significant compartmentalization of HVR1 variants was observed in the peripheral blood mononuclear cells and cryoprecipitates from a small number of patients with cryoglobulinemia (32). Interestingly, a large insertion of 5 aa encoded by codons 385 to 389, akin to that from 1 of our patients, was detected for 1 of the 10 patients with cryoglobulinemia that were studied. Also, phylogenetic analysis of the HVR1 quasispecies revealed a significantly greater distance between variants for genotype 2-infected patients than for those infected with genotype 1, irrespective of the presence or absence of cryoglobulins, in agreement with our previously published data (6). Although our study was not specifically designed to examine differences between HVR1 quasispecies compositions in the supernatants and cryoprecipitates, an issue which was carefully addressed in other studies such as that mentioned above (32), the methodological approach we followed with this large series of patients allowed for the amplification of variants representative of the entire serum virus population.
Analysis of the entire E2 sequence for a proportion of patients also failed to establish a link between HCV envelope motifs and cryoglobulinemia, suggesting that, at least in serum, no clustering of specific mutations, including potentially important regions such as the CD81-binding site, exists in this setting.
Why, then, do over two-thirds of patients with chronic HCV infection develop B-cell lymphoproliferative disorders which, in a proportion of cases, may be clinically relevant? It is conceivable that the initially nonneoplastic monoclonal expansion characteristic of this condition may be a consequence of protracted antigenic stimulation, as frequently observed in several chronic viral infections (30). This phenomenon is thought to occur via the engagement of a widely distributed tetraspanin, CD81, with the HCV E2 protein, which would reduce the B-cell activation threshold (25), providing a plausible explanation for the antigen-dependent autoantibody production and B-cell clonal expansion which may occur as extrahepatic manifestations of chronic HCV infection.
In summary, extensive analysis of the HCV E2 envelope protein and, in particular, of HVR1, which is known to be the target of immune selection contributing to the characteristic HCV quasispecies distribution, failed to reveal sequence-specific motifs which were allegedly associated with cryoglobulinemia. Since B-cell activation appears to be a general feature of HCV infection (25 and our own unpublished data), B-cell lymphoproliferative disorders and the related inappropriate generation of cryoglobulins may arise following as-yet-unidentified host rather than virus-specific factors. Sequence variation in the HVR1 region of HCV most likely results from Darwinian selection regulated by the host immune response. HVR1 sequence changes are therefore secondary to the establishment of a specific immune selection. Skewing in HVR1-sequence distribution in cryoglobulinemia may reflect skewed immune responses, which probably follow rather than precede the establishment of pathological B-cell monoclonal proliferation.
| ACKNOWLEDGMENTS |
|---|
We thank Stefania Varchetta, Area Infettivologica, Fondazione IRCCS Policlinico San Matteo, Pavia, Italy, for help with submission of cDNA sequences to the NCBI database; Massimo Cugno, Department of Internal Medicine, University of Milan, for providing control sera; and Lara Firmo for editorial assistance.
| FOOTNOTES |
|---|
Published ahead of print on 21 February 2007.
Present address: Dipartimento di Scienze Farmaceutiche, Università degli Studi di Firenze, Firenze, Italy.
# Present address: Dipartimento di Scienze Applicate, Università degli Studi di Napoli "Parthenope," Napoli, Italy.
| REFERENCES |
|---|