Journal of Virology, October 2005, p. 12674-12680, Vol. 79, No. 20
0022-538X/05/$08.00+0 doi:10.1128/JVI.79.20.12674-12680.2005
Copyright © 2005, American Society for Microbiology. All Rights Reserved.
Margaret May,2 Raquel Martinez,1 Pascal Meylan,1 Jürg Ott,3 Jacques S. Beckmann,4 Amalio Telenti,1* and the Swiss HIV Cohort Study
Institute of Microbiology, University of Lausanne, Switzerland,1 Department of Social Medicine, University of Bristol, United Kingdom,2 Laboratory of Statistical Genetics, Rockefeller University, New York, New York,3 Medical Genetics, University Hospital of Lausanne, Switzerland,4 and the Swiss HIV Cohort Study
Received 2 March 2005/ Accepted 20 July 2005
We addressed these issues by testing whether a two-step screening by an ex vivo/in vivo model of population genetics would allow identification and validation of novel host genetic variants influencing HIV-1. For this, we infected purified CD4 T cells from healthy blood donors and extracted DNA for detailed resequencing of candidate genes. Alleles identified as possibly modifying cellular permissiveness ex vivo were thereafter assessed in a cohort of HIV-1-infected individuals. Finally, selected alleles of gene candidates and of previously reported genes influencing HIV-1 infection were included in a multigene model to define their contribution to global prediction of HIV-1 disease progression.
Candidate genes, identification of SNPs, and allelic discrimination.
The genes involved in the HIV-1 life cycle selected for analysis included TSG101 (GenBank accession no. U82130), βTRC (Y14153), PPIA (X52851), INI1 (AJ011737), NAF1 (AY012155), PML/TRIM19 (X63131), HP68 (X76388), YY1 (Z14077), and AIP1/ALIX (AF151793). Single nucleotide polymorphism (SNP) discovery used single-strand conformation polymorphism and sequencing. For this, a total of 138 PCRs were designed to cover exons, putative promoter regions, and intron-exon boundaries. Selected positions were then genotyped by using TaqMan allelic discrimination (primer information available on request). The following previously reported host genetic variants influencing HIV-1 disease progression were investigated by TaqMan or restriction fragment length polymorphism analysis (primers and probes available from the authors): CCR5 Δ32 (7, 38), CCR5 promoter 59029G>A (28, 30), CCR5 coding region 303T>A (37), CCR2 coding region 64I (40), CX3CR1 coding region 280M (8), RANTES promoter region 403G>A (25), RANTES promoter region 28C>G (16, 25), RANTES intron 1.1 (1), MIP1α intron 1 459C>T (16), and IL-4 promoter 589C>T (35). Three additional alleles, SDF1 3'A (50) and IL-10 3575T>A and 592C>A (39), were assessed only in vivo. SDF1 is a ligand of CXCR4 and, thus, of limited relevance in an ex vivo analysis that used an R5-tropic strain. Although IL-10 is produced by Th2 CD4 T cells, the variants were not analyzed ex vivo because this cytokine is expected to act through macrophages and other cells not present in the purified CD4 T-cell population. The ex vivo viral replication for each genotype was represented by the median p24 antigen production.
In vivo analysis: characterizing phenotype by CD4 cell count decline. Participants (n = 851) were recruited within the genetics project of the Swiss HIV Cohort Study (SHCS) (http://www.shcs.ch/). The ethics committees of all participating centers approved the study. Patients gave written informed consent for genetic testing. DNA from peripheral blood mononuclear cells from participants was used for genotyping. The purpose of the in vivo analysis was to allocate to each patient a reliable phenotype that would be a marker of disease progression before treatment. The rate of decline in the CD4 T-cell count during the natural history of disease progression was considered to be the most appropriate marker to use as a phenotype. Patients who had at least two CD4 measurements before exposure to potent antiretroviral therapy (ART) were included in the analysis. Time origin for the CD4 T-cell measurements was the estimated date of seroconversion. This was calculated for each patient using a method proposed by Geskus et al. (14), which was applied to the entire SHCS. The method matches the first CD4 T-cell measurement of a patient with an unknown date of seroconversion with the measurements from seroconverters and uses kernel density estimation to infer the most likely date of seroconversion. The CD4 T-cell trajectories were modeled using a repeated measures hierarchical approach using MLwiN software (http://multilevel.ioe.ac.uk). Square-root-transformed CD4 T-cell counts were modeled as a linear function of time from the estimated date of seroconversion, with random effects for both the intercept and the gradient and with additional terms for sex, age (16 to 29, 30 to 39, 40 to 49, and 50 years and above), and risk group (intravenous drug use [IDU]/non-IDU). For each genotype, the average square root CD4 decline per year was estimated in dominant and recessive models. Haplotypes were attributed using SNPHAP (http://www-gene.cimr.cam.ac.uk/clayton/software/).
The correlation between the ex vivo phenotype of viral permissiveness and the in vivo phenotype of CD4 decline was calculated as the correlation between the coefficient from the ex vivo regression and the difference in the square-root-transformed CD4 gradient from the in vivo analysis, comparing carriers of the rare allele with the group homozygous for the common allele. Concordance between ex vivo and in vivo results was inspected graphically.
Multiple gene effects model. In order to estimate the combined effect of all the polymorphisms on CD4 T-cell decline, models were fitted using (i) only polymorphisms previously reported to affect disease progression and (ii) only candidate polymorphisms proposed in this paper. Finally, a stepwise procedure was used on the full model to find an optimal model by omitting the least significant term and comparing model deviations at each iteration. All models were adjusted for sex, IDU, and age and were estimated using all available pretreatment CD4 counts. Models were compared to the null model without genetic terms using the likelihood ratio test. Confidence intervals (95%) were calculated for the predicted difference in latency period due to genetic effects using the percentile simulation method with 100,000 iterations. The prognostic model was used to estimate the combined effect on disease progression of the differences in CD4 decline according to different genotypes. In order to illustrate the clinical importance of the combined effect, we estimated the range of times for the CD4 count to drop from 500 to 200 cells/µl.
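The link between a CD4 gradient on the square-root scale and a latency period in years can be illustrated with a short calculation. The sketch below is not the authors' MLwiN model: it ignores covariates and random effects, assumes a single fixed gradient, and uses hypothetical values for the baseline gradient, the genotype effect, and its standard error. The function names are likewise illustrative. It shows how a percentile interval for the difference in time to fall from 500 to 200 cells/µl could be simulated.

```python
import math
import random

def years_to_fall(start, end, slope):
    """Years for CD4 to fall from `start` to `end` cells/µl when
    sqrt(CD4) declines linearly by `slope` units per year (slope > 0)."""
    return (math.sqrt(start) - math.sqrt(end)) / slope

def percentile_interval(baseline_slope, slope_diff, slope_diff_se,
                        n_iter=10_000, seed=1):
    """95% percentile interval for the change in 500 -> 200 latency.

    slope_diff is the (hypothetical) genotype effect on the gradient;
    a negative value means slower decline in carriers.
    """
    rng = random.Random(seed)
    base = years_to_fall(500, 200, baseline_slope)
    draws = []
    for _ in range(n_iter):
        # Draw a plausible genotype effect and convert it to years.
        slope = baseline_slope + rng.gauss(slope_diff, slope_diff_se)
        if slope > 0:  # skip draws that imply no decline at all
            draws.append(years_to_fall(500, 200, slope) - base)
    draws.sort()
    lower = draws[int(0.025 * len(draws))]
    upper = draws[int(0.975 * len(draws)) - 1]
    return lower, upper

# Hypothetical inputs, chosen only for illustration.
low, high = percentile_interval(baseline_slope=2.2, slope_diff=-0.3,
                                slope_diff_se=0.1)
print(f"extra years from 500 to 200 cells/µl: {low:.2f} to {high:.2f}")
```

The study reports 100,000 iterations; the sketch defaults to fewer draws only to keep it quick to run.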
DNA from blood donors was used to identify genetic variation in nine candidate genes participating in the life cycle of HIV-1. These included TSG101 (encoding tumor susceptibility gene 101), participating in viral budding through interaction with viral protein p6Gag (13); βTRC (beta-transducin repeats-containing protein), interacting with viral protein Vpu to bring CD4 to the endoplasmic reticulum degradation pathway (27); PPIA (peptidyl-prolyl cis-trans isomerase; cyclophilin A), incorporated into the viral particle through selective interaction with viral capsid (11); INI1 (integrase interactor 1 protein), participating in viral genome transcription (46); NAF1 (Nef-associated factor 1), interacting with viral proteins Nef and matrix (12, 19); PML (TRIM19; promyelocytic leukemia), proposed to act as antiviral protein (46); HP68 (RNase L inhibitor protein), associating with Vif and Gag (52); YY1 (transcription factor yin yang 1), implicated in the down-regulation of the expression of chemokine receptors and of the long terminal repeat promoter of HIV-1 (26); and AIP1/ALIX (mammalian orthologue of the yeast class E vacuolar protein sorting factor, Bro1), a component of the viral budding machinery, which serves to link a distinct region in the L domain of HIV-1 p6 to ESCRT-III (42). These genes are located on different chromosomes (see Fig. S1 in the supplemental material), and thus there is no linkage.
SNP screening was performed on 96 chromosomes through single-strand conformation polymorphism and subsequent sequencing analysis of a cumulative 37,406 nucleotides per individual. For specific SNPs, analysis was extended to 256 chromosomes. Amplification success rate was 89% (33,262 nucleotides screened per donor; 37% coding, 27% promoter and 3' untranslated region, and 36% intron-exon boundaries). A total of 34 SNPs/indels (insertion-deletions) were identified (one variant per 978 nucleotides), of which 23 were newly identified and/or validated at the time of the study (see Fig. S1 in the supplemental material). SNP frequencies ranged from 1 in 6,185 in HP68 to 1 in 333 nucleotides for PPIA. We chose, for the initial steps of marker discovery, to perform association analysis on individual SNPs, not on inferred haplotypes. However, for some associations, particularly those newly detected, the precise combination of SNPs that confers the causal effect has not been established, and observations about individual SNPs are likely to be modified with further study. If a candidate SNP is not causal but is suspected to be marking a linked variant, this analysis would be followed by detailed resequencing of the region containing the associated SNPs (41).
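The screening figures quoted above can be cross-checked with simple arithmetic. The snippet below uses only numbers stated in the text; the variable names are illustrative.

```python
# Figures quoted in the text for the SNP screen.
nucleotides_attempted = 37_406   # cumulative nucleotides per individual
nucleotides_screened = 33_262    # successfully amplified and screened per donor
variants_found = 34              # SNPs and indels identified

success_rate = nucleotides_screened / nucleotides_attempted
nucleotides_per_variant = nucleotides_screened / variants_found

print(f"amplification success: {success_rate:.1%}")
print(f"one variant per {nucleotides_per_variant:.0f} nucleotides")
```

Both results agree with the reported values of 89% and one variant per 978 nucleotides after rounding.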
Association of genetic variants with viral replication ex vivo. Results are presented in detail in Table S1 in the supplemental material, and relevant results are shown in Fig. 1. TSG101 183T>C, βTRC A507S, and PML 225C>T presented statistical significance or trend (P < 0.1, significance level defined post hoc), and profiles suggestive of dominant (TSG101 and βTRC) or recessive (PML) influences. We also present PPIA 1650A>G in Fig. 1A, because of a suggestive profile upon visual inspection. The lack of association with differences in CD4 T-cell permissiveness for all other genetic variants of candidate genes is presented in Table S1 in the supplemental material.
FIG. 1. Association of candidate and known alleles with HIV-1 cell permissiveness ex vivo. (A) Candidate genes. (B) A selection of previously reported host genetic variants influencing HIV-1 infection. Bars represent median values. Shown are P values estimated by a Kruskal-Wallis test and by a Spearman rank test for trend.
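The two tests named in the legend can be sketched in plain Python for the three-genotype case. This is an illustrative implementation, not the software the authors used. The p24 values are hypothetical, the Kruskal-Wallis P value uses the chi-square approximation with 2 degrees of freedom (valid only for exactly three groups), and only the Spearman coefficient is computed, not its P value.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def kruskal_wallis_3(groups):
    """Tie-corrected Kruskal-Wallis H and P value for exactly three groups."""
    assert len(groups) == 3
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    h, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        h += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * h - 3 * (n + 1)
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    h /= 1 - sum(t ** 3 - t for t in counts.values()) / (n ** 3 - n)
    return h, math.exp(-h / 2)  # chi-square upper tail with 2 df

def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

# Hypothetical p24 values grouped by number of rare alleles (0, 1, 2).
p24_by_genotype = [[310.0, 280.0, 450.0, 390.0], [220.0, 260.0, 180.0], [90.0, 150.0]]
h_stat, p_value = kruskal_wallis_3(p24_by_genotype)
allele_count = [0] * 4 + [1] * 3 + [2] * 2
rho = spearman_rho(allele_count, [v for g in p24_by_genotype for v in g])
print(f"H = {h_stat:.2f}, P = {p_value:.3f}, Spearman rho = {rho:.2f}")
```

Coding genotypes as 0, 1, or 2 copies of the rare allele is one common way to turn the rank correlation into a test for trend across genotypes; whether the authors coded genotypes this way is an assumption.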
… CCR5 Δ32, CCR5 59029G>A, CCR5 303T>A, CCR2 64I, CX3CR1 280M, RANTES 403G>A, RANTES 28C>G, RANTES intron 1.1, MIP1α 459C>T, and IL-4 589C>T (Fig. 1B) (see Table S1 in the supplemental material). There was an overall trend in cell permissiveness ex vivo for alleles associated with slow (CCR5 Δ32 and CCR2 64I) or rapid disease progression (RANTES intron 1.1) in vivo.
Association of genetic variants with disease progression in vivo.
Of 34 candidate alleles (i.e., SNPs and indels) identified by resequencing and tested ex vivo, 5 were thereafter assessed in the SHCS for their roles in modulating HIV-1 infection in study participants. The clinical phenotype was defined as the patient-specific rate of CD4 T-cell decline, a recognized marker of disease progression (3). We used this phenotype because of limited data on AIDS-defining illness or death in this cohort, as the progression of disease was for most patients stopped with the arrival of potent ART. Analysis excluded any CD4 T-cell values after initiation of treatment. The median follow-up time was 3.1 years; the mean was 4.0 years, during which the 851 cohort participants contributed 8,231 CD4 T-cell determinations to the analysis (median, 7 CD4 T-cell determinations per participant). Square-root-transformed CD4 cell counts were modeled as a linear function of time from the estimated date of seroconversion. Overall, we identified a hierarchy of putative effects on disease progression (Fig. 2A), where the new candidate markers exhibited effects on CD4 T-cell depletion that were comparable to a number of previously reported gene variants. Among the previously reported markers investigated, some were not informative in this data set (IL-10 variants and RANTES 403G>A). The rare CCR5 coding region 303T>A was not present in the study population. The SDF1 3'A genotype was associated with an accelerated progression, in agreement with most of the publications reporting on this variant (2, 20, 33, 47). In addition, other variants were at odds with some of the available literature (RANTES intron 1.1, IL-4 589T, and CX3CR1 280M). This reflects the ongoing controversy surrounding the net contribution of some of these variants to disease progression (44), in particular for IL-4 589C>T (31, 35, 48) and CX3CR1 280M (8, 9, 21, 24, 29). 
Differences between the present study and previous reports may also reflect the choice of analysis that uses, in a seroprevalent cohort, CD4 decline rather than time to AIDS or death. In single gene analysis, the time to diminish from 500 to 200 CD4 T cells was estimated to increase from 3.7 to 4.3 years (P = 0.03) for CCR5 Δ32 carriers and from 3.6 to 4.2 (P = 0.03) for CCR2 64I heterozygous carriers. It decreased from 3.9 to 3.4 years (P = 0.02) for PPIA 1650A>G carriers and from 3.9 to 3.4 (P = 0.03) for TSG101 183T>C carriers (see Table S2 in the supplemental material).
FIG. 2. Association of candidate and known alleles with HIV-1 disease progression in vivo. (A) Difference in square root CD4 gradient comparing carriers of a rare allele with patients homozygous for the common allele (whiskers show 95% confidence interval). (B) Ex vivo/in vivo correlation for markers associated with differences in permissiveness or disease progression.
… CCR5 Δ32 and CCR2 64I (lower ex vivo replication and slower disease progression), and for the new candidate PPIA promoter 1650A>G (greater ex vivo replication and faster disease progression). Strikingly, TSG101 183T>C was associated with lower CD4 T-cell permissiveness ex vivo but with a rapid loss in CD4 T cells in vivo. Extensive analysis of the role of TSG101 variants in HIV-1 disease progression will be presented in detail elsewhere.
Multiple gene effects model.
We hypothesized that inclusion of reported markers together with candidate markers of PML, PPIA, TSG101, and βTRC would improve prediction of CD4 T-cell count decline in multiple gene models. Table 1 shows the coefficients from four models, which estimate the difference in the square-root-transformed CD4 gradient comparing patients who are homozygous or carriers of the rare allele with patients homozygous for the common allele. Model 1 estimates the combined effect of the known markers compared to the null model without genetic terms (P = 0.007), and model 2 estimates the combined effect of the candidate markers alone (P = 0.02). Model 3 includes both known and candidate markers (P = 0.001). Model 4 was constructed using a stepwise procedure to eliminate the least predictive markers, stopping at the model which minimized the P values obtained from comparing candidate models with the null model without genetic terms. The optimal model retained CCR5 Δ32, CCR2 64I, CCR5 59029AA, MIP1α 459TT, SDF1 3'A, PML 225TT, PPIA 1650G, and TSG101 183C (P = 0.00005). The model is statistically optimized given the available data. There might be allelic variants with large effects that are excluded from the optimal model because their standard errors are large due to a small allelic frequency in the population.
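The comparison of each genetic model with the null model rests on a likelihood ratio test, which can be sketched as follows. This is a generic illustration under stated assumptions, not the authors' code: the log-likelihood values are hypothetical, the function names are illustrative, and the number of extra parameters is assumed to equal the number of genetic terms added to the null model.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Closed form for even df: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc term plus a finite series.
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    for k in range((df - 1) // 2):
        total += term
        term *= x / (2 * k + 3)
    return total

def likelihood_ratio_test(loglik_null, loglik_full, extra_params):
    """Return the LR statistic and its chi-square P value."""
    statistic = 2.0 * (loglik_full - loglik_null)
    return statistic, chi2_sf(statistic, extra_params)

# Hypothetical log-likelihoods for a null model and a model with 8 genetic terms.
stat, p = likelihood_ratio_test(loglik_null=-15230.0, loglik_full=-15213.5,
                                extra_params=8)
print(f"LR statistic = {stat:.1f}, P = {p:.5f}")
```

When software reports the deviance (minus twice the log-likelihood) instead, the same statistic is the null-model deviance minus the full-model deviance.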
TABLE 1. Model coefficients showing the difference in the square-root-transformed CD4 T-cell gradient comparing patients who are homozygous or carriers of rare alleles with patients who are homozygous for the common alleles
In sensitivity analyses, we assessed a model limited to using the first six measurements for each patient. This approach suggested that CCR5 59029AA, CCR5 Δ32/CCR2 64I, MIP1α 459TT, RANTES 28G, and the candidate allele PPIA 1604GG have substantially greater effects when the measurements are restricted to earlier times in the disease process. These estimates are consistent with temporal variation in the effect alleles have on the course of infection, as previously reported for variants of the chemokine receptors CCR5 and CCR2 (23, 32, 51). The model did not include data on HLA typing, a major genetic influence (4), the recently described gene dose effect of CCL3L1 (17), or an extensive analysis of the CCR5 region to better define the contribution of CCR5 59029G>A in the context of the haplotypic structure of the gene cluster (15). Cross-validation with other cohorts remains mandatory before the proposed alleles of PML, TSG101, and PPIA can be defined as true host genetic variants influencing HIV-1 infection.
Only 10% of patients included in the study had known dates of seroconversion; thus, we estimated seroconversion dates (14). The effects of known influences (i.e., CCR5 and CCR2 alleles) found in this patient population support the validity of the effects estimated for other allelic variants. The study population investigated contributed substantial data on CD4 counts but insufficient data on viral load and limited data on AIDS-defining illness or death, as the progression of disease was for most patients controlled with the availability of potent ART. Analysis excluded any values after the initiation of treatment. We believe it is a strength of this work to define disease progression by CD4 loss rather than by AIDS or death, as this is an approach that can be proposed to other seroprevalent cohorts where disease progression is halted by treatment.
Conclusions. We are aware that an ex vivo system of population genetics will not capture genes linked to mechanisms of pathogenesis that are not relevant to a CD4-only model, in particular, those involving immunogenetic determinants, or other cell types involved in the pathogenesis of AIDS. This system also represents a reductionist approach to a highly complex process involving host and viral factors. However, we reasoned that the analysis of a simplified, more homogenous surrogate assay might overcome some of the obstacles in the analysis of complex traits by providing a method to single out potential candidate genes which could then be tested for their role in HIV-1 disease in vivo. The population included 128 donors, approximately 1 log less than the number of participants probably needed in association studies in vivo (22), consistent with the speculation that a standardized, simplified system may magnify the consequences of specific alleles. Association studies do not define causality between certain SNPs and clinical/biological outcomes. The functional relevance of a marker allele needs biological plausibility and validation. The ex vivo system can be used for functional analyses such as the detailed analysis of cellular restriction blocks (6) and analysis of mRNA expression or splicing of selected genes (10). Alternatively, the ex vivo system can also be seen as an independent confirmation of in vivo findings, thus increasing the likelihood of identifying true genetic associations.
Overall, from eight previously reported genes associated with disease progression, the ex vivo/in vivo data were in agreement for two genes, one gene was identified only in vivo, and one gene that could not be evaluated ex vivo was associated with an effect in vivo. The data were in agreement with the literature for three genes, and one remains controversial among several publications. In particular, there was a correlation between the ex vivo and in vivo phenotype for well-established genetic variants of CCR5 and CCR2. For nine candidate genes, five were dropped during the ex vivo assessment, one gene presented consistent data ex vivo/in vivo, while two were retained by both approaches; however, the attributed effect was of contrary sign between the two testing strategies. The paradoxical nature of the TSG101 and PML allele effects (less ex vivo viral production and faster progression in vivo) would imply mechanisms of pathogenesis for which there is no biological support at this time. The sequential approach for the selection of markers cannot exclude the possibility that valid markers were excluded that would have been identified in vivo. In the absence of a gold standard, and in the context of the complexity of making genetic associations on solid ground, a sequential strategy for screening ex vivo and validating in vivo cannot be presented as satisfactory or as suboptimal; rather, it is a potential approach to examining host genes participating in the viral life cycle and therefore intuitively amenable to testing in a cellular system. These are critical issues in the field of complex trait genetics where only 30% of reported associations can be considered proven (22, 45).
We thank Mary Carrington for useful commentary.
The members of the Swiss HIV Cohort Study are S. Bachmann, M. Battegay, E. Bernasconi, H. Bucher, P. Bürgisser, M. Egger, P. Erb, W. Fierz, M. Fischer, M. Flepp (Chairman of the Clinical and Laboratory Committee), P. Francioli (President of the SHCS, Centre Hospitalier Universitaire Vaudois, Lausanne, Switzerland), H. J. Furrer, M. Gorgievski, H. Günthard, P. Grob, B. Hirschel, L. Kaiser, C. Kind, T. Klimkait, B. Ledergerber, U. Lauper, M. Opravil, F. Paccaud, G. Pantaleo, L. Perrin, J.-C. Piffaretti, M. Rickenbach (Head of Data Center), C. Rudin (Chairman of the Mother and Child Substudy), J. Schupbach, R. Speck, A. Telenti, A. Trkola, P. Vernazza (Chairman of the Scientific Board), R. Weber, and S. Yerly.
Supplemental material for this article may be found at http://jvi.asm.org/.
G.B. and M.M. contributed equally to this work.
A.T. is a member of the Swiss HIV Cohort Study.