Purpose: Cell-based approaches were used to identify genetic markers predictive of patients' risk for poor response prior to chemotherapy.
Experimental Design: We conducted genome-wide association studies (GWAS) to identify single-nucleotide polymorphisms (SNP) associated with cellular sensitivity to carboplatin through their effects on mRNA expression using International HapMap lymphoblastoid cell lines (LCL) and replicated them in additional LCLs. SNPs passing both stages of the cell-based study were tested for association with progression-free survival (PFS) in patients. Phase 1 validation was based on 377 ovarian cancer patients receiving at least four cycles of carboplatin and paclitaxel from the Australian Ovarian Cancer Study (AOCS). Positive associations were then assessed in phase 2 validation analysis of 1,326 patients from the Ovarian Cancer Association Consortium and The Cancer Genome Atlas.
Results: In the initial GWAS, 342 SNPs were associated with carboplatin-induced cytotoxicity, of which 18 unique SNPs were retained after assessing their association with gene expression. One SNP (rs1649942) was replicated in an independent LCL set (Bonferroni adjusted P < 0.05). It was found to be significantly associated with decreased PFS in phase 1 AOCS patients (Pper-allele = 2 × 10−2), with a stronger effect in the subset of women with optimally debulked tumors (Pper-allele = 4 × 10−3). rs1649942 was also associated with poorer overall survival in women with optimally debulked tumors (Pper-allele = 9 × 10−3). However, this SNP was not significant in phase 2 validation analysis with patients from numerous cohorts.
Conclusion: This study shows the potential of cell-based, genome-wide approaches to identify germline predictors of treatment outcome and highlights the need for extensive validation in patients to assess their clinical effect. Clin Cancer Res; 17(16); 5490–500. ©2011 AACR.
One of the greatest challenges in discovering pharmacogenomic markers for anticancer agents with a whole-genome approach is identifying a relevant system. Although humans are the most relevant system for study, the whole-genome approach requires a large number of well-phenotyped patients treated with the same dosage regimen of a single chemotherapeutic drug. Such studies are extremely expensive and take years to accrue enough patients for adequate power. To address this, we have developed an in vitro model system that overcomes these challenges. Genome-wide germline genetic marker discovery and replication were conducted in these cell-based models and findings were validated in clinical settings. This study shows not only the potential of this cell-based genome-wide approach to identify clinically important germline predictors of outcome following chemotherapy but also the need for extensive validation in clinical samples.
Ovarian cancer is the fifth leading cause of cancer mortality among women (1). Treatment of advanced disease consists of a platinum agent (usually carboplatin) and a taxane (usually paclitaxel) following cytoreductive surgery (2). Despite the high initial response rate to this chemotherapy, a proportion of cancers are intrinsically resistant to therapy (3) and susceptibility to side effects is variable, with some patients developing severe carboplatin-induced myelosuppression (4). Clinically useful predictors that identify individuals most likely to benefit from carboplatin, or for that matter most chemotherapy, are lacking. Hence, identifying patients prior to treatment who are less likely to benefit from or most likely to experience adverse events from chemotherapeutic agents is essential.
Humans are the most relevant system for pharmacogenomic discovery in oncology; however, executing pharmacogenomic clinical trials with enough power to detect true genetic signals in the presence of multiple confounding factors such as concomitant medications, dosage, and diet is extremely costly and difficult. Therefore, cell-based models evaluating gene expression, genetic polymorphisms, and/or other biomarkers have been developed to help predict chemotherapy-induced response and toxicity (5). One such model uses International HapMap lymphoblastoid cell lines (LCL) that have extensive and publicly available genotypic information, enabling genome-wide association studies (GWAS) that identify, in an unbiased fashion, genotype–phenotype relationships (5). The advantage of using LCLs in pharmacogenomics discovery is that they can be grown under identical conditions, allowing the phenotype to be tested in a well-controlled, isolated system without many of the confounders found in vivo. Most important, HapMap LCLs have publicly available genotypic data and are now part of the 1000 Genomes Project (6). Using International HapMap LCLs, we developed a genome-wide model (referred to as the “triangle model”) that integrates genotype, gene expression, and in vitro cytotoxicity data to identify genetic polymorphisms associated with cellular sensitivity to chemotherapeutics (7–10). The successful clinical validation of these cell-based model findings was recently reported in a small head and neck cancer trial (11).
The goal of the present study was to use this cell model to discover genetic variants associated with cellular sensitivity to carboplatin that could be tested in a large cohort of clinical samples from patients treated with carboplatin. We hypothesized that genetic variants identified in our cell-based model would have utility in identifying patients treated with carboplatin with different clinical outcomes.
Materials and Methods
Genome-wide approach to identify genetic polymorphisms that are associated with carboplatin sensitivity
EBV-transformed LCLs derived from 30 Centre d'Etude du Polymorphisme Humain (CEPH) trios from Utah residents with ancestry from northern and western Europe (HAPMAPPT01, CEU) along with 52 unrelated CEPH LCLs (8) were purchased from the Coriell Institute for Medical Research (Camden, NJ). Cell growth inhibition was evaluated at concentrations of 0, 10, 20, 40, and 80 μmol/L carboplatin for 72 hours, as reported previously (11, 12). The concentration required to inhibit 50% cellular growth (IC50) was determined for each LCL through curve fitting and used as an indicator of carboplatin sensitivity.
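The curve-fitting procedure is not specified in this section. As a minimal sketch, assuming a simple interpolation on log10 concentration between the two tested doses that bracket 50% growth (which may differ from the fitting method actually used), an IC50 and its log2 transform could be obtained as follows. The survival percentages are hypothetical and serve only to illustrate the calculation.

```python
import math

def ic50_by_interpolation(concs, survival):
    """Estimate IC50 by linear interpolation on log10(concentration).

    concs: increasing non-zero drug concentrations in umol/L.
    survival: percent growth relative to the untreated control at each dose.
    Returns None if survival never crosses 50% within the tested range.
    """
    for i in range(len(concs) - 1):
        c_lo, c_hi = concs[i], concs[i + 1]
        s_lo, s_hi = survival[i], survival[i + 1]
        if s_lo >= 50.0 >= s_hi and s_lo != s_hi:
            # Fraction of the way from the lower to the higher dose.
            frac = (s_lo - 50.0) / (s_lo - s_hi)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None

# Non-zero doses from the text; survival values are hypothetical.
doses = [10.0, 20.0, 40.0, 80.0]
surv = [85.0, 60.0, 40.0, 15.0]
ic50 = ic50_by_interpolation(doses, surv)
log2_ic50 = math.log2(ic50)  # log2 scale, as used for the association phenotype
```

A parametric dose-response fit would normally be preferred when replicate measurements are available; the interpolation above only shows how a single IC50 value per cell line arises from the five-point dose series.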
The genome-wide approach that incorporates genome-wide single-nucleotide polymorphism (SNP), gene expression, and carboplatin IC50 data to identify genetic predictors of platinum sensitivity was described previously (7–9, 11). Briefly, SNP genotypes from the CEU population were downloaded from the International HapMap database (http://www.HapMap.org, release 22). A total of 2,286,186 SNPs with minor allele frequency (MAF) of more than 5% and no Mendelian inheritance transmission errors in the CEU trios were used. IC50 values were log2 transformed to obtain approximately normally distributed phenotypes. The quantitative transmission disequilibrium test (QTDT) was conducted to identify any genotype–cytotoxicity associations by using the QTDT software (http://www.sph.umich.edu/csg/abecasis/QTDT/; ref. 13) with sex as a covariate. P ≤ 1 × 10−4 was used to select SNPs to carry forward in the analysis. The linkage disequilibrium (LD) patterns at selected SNPs within each population were evaluated using Haploview version 3.32 (http://www.broad.mit.edu/mpg/haploview/). To detect evidence of recent positive selection in the genomic regions of interest, we employed the Haplotter online tool (http://haplotter.uchicago.edu/) developed by the Pritchard group to compute the integrated haplotype score (iHS; ref. 14). The iHS quantifies the amount of extended haplotype homozygosity at a locus along the ancestral allele background relative to the derived allele background. Because iHS is standardized with mean = 0 and variance = 1, a positive iHS score greater than 2 means that haplotypes on the ancestral allele background are longer than those of the derived allele background.
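The two SNP inclusion criteria described above (MAF of more than 5% and no Mendelian transmission errors in the trios) can be illustrated with a short sketch. It assumes genotypes coded as counts of one allele (0, 1, or 2) at an autosomal SNP and MAF computed in the parents only; the function names and the example genotypes are hypothetical, and the sketch does not reproduce the HapMap or QTDT software.

```python
def minor_allele_frequency(genotypes):
    """Minor allele frequency from allele counts (0, 1, 2); None marks a missing call."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return None
    freq = sum(called) / (2 * len(called))
    return min(freq, 1.0 - freq)

def is_mendelian_consistent(father, mother, child):
    """True if the child's autosomal genotype can be produced by the parental genotypes."""
    # Alleles (counted as 0 or 1) that a parent with a given genotype can transmit.
    transmissible = {0: {0}, 1: {0, 1}, 2: {1}}
    possible = {a + b for a in transmissible[father] for b in transmissible[mother]}
    return child in possible

def passes_filters(parent_genotypes, trios, maf_cutoff=0.05):
    """Apply both criteria: MAF above the cutoff and no transmission error in any trio."""
    maf = minor_allele_frequency(parent_genotypes)
    if maf is None or maf <= maf_cutoff:
        return False
    return all(is_mendelian_consistent(f, m, c) for f, m, c in trios)

# Hypothetical genotypes for four trios (father, mother, child).
trios = [(0, 1, 0), (0, 1, 1), (2, 0, 1), (1, 1, 2)]
parents = [g for f, m, _ in trios for g in (f, m)]
keep = passes_filters(parents, trios)
```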
Baseline gene expression was evaluated in 87 CEU LCLs by using the Affymetrix GeneChip Human Exon 1.0 ST array (Exon Array), as previously described (10). The gene expression data described in this study have been deposited into GEO (GenBank accession no: GSE7761). Genes were evaluated as “transcript clusters,” each of which refers to a cluster of one or more exons covering a genic region. Transcript cluster expression summarizes all exonic transcriptional evidence for a known or putative gene. SNPs derived from the genotype–IC50 association analysis in CEU were tested for their association with gene expression, using the QTDT test with gender as a covariate as described previously (9). A Bonferroni correction (Pc < 5 × 10−2) using the number of expressed genes tested was used to adjust for multiple testing.
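For illustration, the Bonferroni correction amounts to dividing the nominal significance level by the number of tests, or equivalently multiplying each P value by that number. The sketch below assumes the 13,314 transcript clusters reported in the Results as the number of tests; the function names are ours.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that controls the family-wise error at alpha."""
    return alpha / n_tests

def bonferroni_adjust(p_value, n_tests):
    """Bonferroni-adjusted P value, capped at 1."""
    return min(1.0, p_value * n_tests)

n_transcript_clusters = 13314  # number of transcript clusters tested, from the Results
threshold = bonferroni_threshold(0.05, n_transcript_clusters)

# A hypothetical SNP-expression P value and its adjusted counterpart.
p_raw = 1e-6
p_adjusted = bonferroni_adjust(p_raw, n_transcript_clusters)
is_significant = p_adjusted < 0.05
```

Under this assumption, a SNP–expression association needs an unadjusted P value below roughly 4 × 10−6 to pass the corrected 5 × 10−2 threshold.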
To examine the relationship between gene expression and sensitivity to carboplatin, a general linear model was constructed with log2-transformed carboplatin IC50 as the dependent variable and transformed gene expression level together with an indicator for gender as the independent variables (9). If an SNP was significantly associated with carboplatin IC50 and the same SNP was significantly associated with gene expression, then the aforementioned approach was used to test whether gene expression significantly predicted IC50. Transcript cluster expression, with gender as a covariate, was tested as a predictor of carboplatin sensitivity in the CEU population. P < 5 × 10−2 was considered statistically significant. A target gene in this analysis was defined as one whose expression was associated with 1 or more SNP genotypes and whose expression significantly correlated with carboplatin IC50.
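The general linear model described above has the form log2(IC50) = b0 + b1 × expression + b2 × gender. As a minimal sketch of how such a model is fitted by ordinary least squares, the code below solves the normal equations directly; it estimates coefficients only and omits the significance test on b1 that the analysis relies on. The data are hypothetical and noise-free, chosen so that the known coefficients are recovered.

```python
def fit_linear_model(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    rows: list of predictor tuples; an intercept column is added here.
    y: list of responses.
    Returns [intercept, b1, b2, ...]. Assumes X'X is non-singular.
    """
    X = [[1.0] + [float(v) for v in r] for r in rows]
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * yi for x, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Hypothetical data: (expression level, gender indicator) per cell line.
predictors = [(7.0, 0), (8.0, 1), (9.0, 0), (10.0, 1), (7.5, 1), (9.5, 0)]
# Simulated log2 IC50 values with a negative expression effect.
log2_ic50 = [6.0 - 0.4 * e + 0.2 * g for e, g in predictors]
intercept, b_expression, b_gender = fit_linear_model(predictors, log2_ic50)
```

A negative expression coefficient, as in this toy example, corresponds to higher expression going with lower IC50, that is, greater sensitivity.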
Genotype carboplatin sensitivity analysis on replication sample set
National Cancer Institute (NCI) and National Human Genome Research Institute (NHGRI) have jointly published a set of guidelines for designing a replication study following GWAS (15). Our study sought to adhere closely to these guidelines. A set of 52 unrelated LCLs that are part of the same genetic ancestry (CEPH/Utah) but are not part of HAPMAPPT01 LCLs that were used in the discovery set was evaluated for carboplatin sensitivity. Phenotypic data were log2 transformed prior to association analysis, as was done for the discovery set. Eighteen SNPs identified in the discovery step [after removing redundant SNPs with LD r2 ≥ 0.8] were genotyped using Sequenom MassARRAY iPLEX platform. Genotype–phenotype association evaluation was conducted using linear regression. Additive genetic effects were assumed for these associations. Bonferroni adjusted P < 5 × 10−2 was considered statistically significant.
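Under the additive genetic model assumed here, each genotype is coded as the number of minor alleles carried (0, 1, or 2) and the log2-transformed IC50 is regressed on that count. The following sketch, with hypothetical genotypes and phenotypes, shows the slope (the per-allele effect) and the proportion of phenotypic variance explained (r2); it does not compute the P value used for the replication decision.

```python
def additive_regression(dosages, phenotypes):
    """Simple linear regression of a phenotype on minor-allele count (0, 1, 2).

    Returns (slope, intercept, r_squared).
    """
    n = len(dosages)
    mean_x = sum(dosages) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    syy = sum((y - mean_y) ** 2 for y in phenotypes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, phenotypes))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared

# Hypothetical minor-allele counts and log2 IC50 values for six cell lines.
dosages = [0, 0, 1, 1, 2, 2]
log2_ic50 = [4.0, 4.2, 4.5, 4.7, 5.0, 5.2]
slope, intercept, r_squared = additive_regression(dosages, log2_ic50)
```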
Validation of selected SNPs in clinical samples
SNPs found to be significantly associated with carboplatin sensitivity in vitro were evaluated in invasive ovarian cancer patients receiving primary chemotherapy treatment of a minimum of 4 cycles of paclitaxel (175 or 135 mg/m2) and carboplatin [area under the curve (AUC) = 5 or 6] at 3 weekly intervals, using a 2-phase validation approach. In phase 1, we analyzed genotype data from 377 non-Hispanic white patients from the Australian Ovarian Cancer Study (AOCS), both overall and according to debulking status (optimally debulked patients ≤1 cm residual disease, and suboptimally debulked patients >1 cm residual disease). In phase 2, we further evaluated positive findings in 1,326 non-Hispanic white invasive ovarian cancer patients of all histologies and morphologies receiving primary chemotherapy for a minimum of 4 cycles of paclitaxel and carboplatin known, or presumed, to have had the standard dosages as given to the patients in phase 1. Phase 2 patients were derived from 6 studies participating in the Ovarian Cancer Association Consortium (OCAC), plus additional AOCS patients not included in phase 1 validation analysis, and The Cancer Genome Atlas (TCGA); ethnicity was self-reported. Details of study design and patient ascertainment have been previously described elsewhere (16, 17) and are summarized in Supplementary Table S1. Clinical characteristics of patients participating in phase 1 and 2 validation studies are summarized in Supplementary Table S2. In AOCS, BEL (Belgium Ovarian Cancer Study), LAX (Women's Cancer Research Institute–Cedars-Sinai Medical Center), and RPX (Roswell Park Cancer Institute Cases), progression-free survival (PFS) was defined as the time interval between the date of histologic diagnosis and the first confirmed sign of disease recurrence, or progression, based on Response Evaluation Criteria in Solid Tumors (RECIST), modified for ovarian cancer as defined by the Gynecologic Cancer InterGroup (GCIG), as previously described (18, 19). 
For SRO (Scottish Randomised Trial in Ovarian Cancer), progression was defined as (i) a 25% or greater increase in the size of at least 1 bi- or unidimensionally measurable lesion, (ii) a clear worsening from previous assessment of any evaluable disease (note that worsening of existing nonevaluable disease did not constitute progression), (iii) the reappearance of any lesion that had disappeared, with the exception of ascitic or pleural fluid that was drained and recurred within 3 months of drainage, or (iv) the appearance of any new lesion and/or site. For MAL (Danish Malignant Ovarian Tumor Study), progression was determined by ultrasonography or defined by cancer antigen 125 (CA125) values (increase in CA125 to >35 U/mL from a CA125 value lower than 35 U/mL after the primary treatment). For MAY (Mayo Clinic Ovarian Cancer Case Control Study), progression was defined as radiographic evidence of recurrence or initiation of second-line therapy. No consistent definition of progression was used in TCGA. All studies participating in phase 2 validation were compared at baseline to assess differences in median PFS across sites (Supplementary Fig. S1). Overall survival was the interval between the date of diagnosis and death from any cause. All studies have received approval from their respective human research ethics committees, and all OCAC participants provided written informed consent. Details of TCGA can be found at http://cancergenome.nih.gov/.
DNA extraction, genotyping methods, and quality assurance for all samples available for genotyping have been previously described (18). Genotype data for TCGA patients were downloaded through the TCGA data portal and assessed for ancestral outliers; patients of European descent were included in phase 2 analyses. SNPs found to be significantly associated with carboplatin sensitivity were further evaluated by inferring the missing genotypes in TCGA samples with the reference of the 1000 Genomes Data (1000G 2010-06 release, CEU). We carried out the standard 2-stage imputation in MACH 1.0 (20). Good imputation quality was attained by applying the following quality controls: imputed R2 > 0.3; MAF > 1%; and P value from the Hardy–Weinberg Equilibrium test < 1 × 10−6.
The primary test for association was the relationship between SNP genotypes and PFS and overall survival (OS). The Kaplan–Meier product limit method was used to estimate and plot the PFS and OS probabilities. Cox proportional hazards models were used to obtain hazard ratios (HRs) and their 95% confidence intervals (CIs) adjusted for the effects of International Federation of Gynecology and Obstetrics (FIGO) stage and residual disease (nil, ≤1 cm, >1 and ≤2 cm, >2 cm). Assuming log additive effects, the risks associated with each additional minor allele were estimated by fitting the number of rare alleles carried as a continuous covariate. Risks associated with heterozygosity and homozygosity for the minor allele of SNPs associated with outcome were also estimated. To account for differences in coding of residual disease and tumor characteristics across different studies in phase 2 analyses, estimates were additionally adjusted for tumor histology and grade, and residual disease was fitted as a dichotomous covariate (nil vs. any). Also, to allow for variation in time from diagnosis to study entry across phase 2 studies, PFS and OS data were left truncated, with time at risk starting on date of diagnosis and time under observation beginning at the time of study entry. In addition, OS data were right censored at 5 years postdiagnosis to reduce the number of deaths unrelated to ovarian cancer. Summary per-allele estimates for PFS and OS for all phase 2 validation studies were obtained using a weighted meta-analysis of site-specific loge HR. All tests for association were 2-tailed, statistical significance was assessed at the conventional level of P < 5 × 10−2, and analyses were done in STATA SE v. 11 (Stata Corp.) and the R project for Statistical Computing.
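The weighting scheme used for the meta-analysis is not detailed here. A common choice, assumed in the sketch below, is a fixed-effect inverse-variance weighted average of the site-specific loge HRs, with each standard error recovered from the reported 95% CI. The site-level values are hypothetical and are not study data.

```python
import math

def se_from_ci(lower, upper, z=1.96):
    """Standard error of log(HR) recovered from a 95% confidence interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z)

def fixed_effect_meta(log_hrs, ses):
    """Inverse-variance weighted summary of site-specific log hazard ratios."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * b for w, b in zip(weights, log_hrs)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se

# Hypothetical per-allele estimates per site: (HR, lower 95% CI, upper 95% CI).
sites = [(1.10, 0.85, 1.42), (0.95, 0.75, 1.20), (1.02, 0.80, 1.30)]
log_hrs = [math.log(hr) for hr, _, _ in sites]
ses = [se_from_ci(lo, hi) for _, lo, hi in sites]

pooled, pooled_se = fixed_effect_meta(log_hrs, ses)
summary_hr = math.exp(pooled)
summary_ci = (math.exp(pooled - 1.96 * pooled_se), math.exp(pooled + 1.96 * pooled_se))
```

Sites with narrower confidence intervals receive more weight, so a large site with a null estimate can dominate the summary HR.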
Target gene expression evaluation in NCI-60 data sets
To explore the potential mechanism of action for our identified and validated genetic variation(s), we examined the degree of correlation between the target gene expression and cellular (using both LCLs and NCI60 tumor cells) susceptibility to carboplatin. We downloaded the NCI-60 microarray expression and GI50 data sets (released in March 2007) from the DTP/NCI Molecular Target Databases (21, 22). These data sets are composed of gene expression data on untreated NCI-60 cell lines using different microarray platforms along with GI50 data. Linear regression was done between the expression of a gene of interest and log10 carboplatin GI50 in all 60 tumor cell lines as well as in a subset of ovarian cancer cell lines (n = 7). P < 5 × 10−2 was considered statistically significant.
Results
Phenotyping the discovery and replication samples
Cellular growth inhibition was evaluated in 87 HapMap phase II CEU LCLs (discovery) and an independent set of 52 CEPH LCLs (replication). The average log2-transformed carboplatin IC50 values were not significantly different between these 2 sets of samples (4.49 and 4.60 μmol/L for discovery and replication samples, respectively; P = 0.46). The discovery set results have been described previously (12).
Genome-wide approach to identify genetic polymorphisms that are associated with carboplatin sensitivity
An overview of the study workflow and the number of findings from each analysis are shown in Table 1. We identified a total of 342 SNPs that were strongly associated with carboplatin IC50 phenotype at P ≤ 10−4 in the LCL discovery samples (11). All 342 SNPs are listed in Supplementary Table S3. A binomial test showed a significant difference between our findings and random expectation (P < 10−5), suggesting that our association test results are not likely to be an artifact.
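The comparison with random expectation can be illustrated with a rough calculation. The sketch below assumes that the 2,286,186 tests are independent under the null hypothesis (which LD between SNPs violates) and uses a normal approximation to the binomial distribution, so it is an approximation rather than a reproduction of the test that was run.

```python
import math

n_snps = 2286186   # SNPs tested (from the Methods)
p_cutoff = 1e-4    # P value threshold used to select SNPs
observed = 342     # SNPs passing the threshold

# Under the null, the count of SNPs with P <= cutoff is Binomial(n_snps, p_cutoff).
expected = n_snps * p_cutoff
sd = math.sqrt(n_snps * p_cutoff * (1 - p_cutoff))

# Continuity-corrected z score and one-sided upper-tail probability.
z = (observed - 0.5 - expected) / sd
p_one_sided = 0.5 * math.erfc(z / math.sqrt(2))
```

Roughly 229 SNPs would be expected to pass the threshold by chance alone, so the 342 observed represent an excess of about 50% over expectation.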
SNPs found to be associated with carboplatin sensitivity (n = 342) were further evaluated for their functional relevance, using 13,314 transcript clusters (representing 10,830 genes) expression. These analyses narrowed our candidate SNP list to 31 SNPs (Bonferroni corrected Pc < 5 × 10−2, based on the number of transcript clusters) associated with the expression of 29 different genes. After removing SNPs in high LD (r2 ≥ 0.8), our SNP list was further reduced to 18 unique signals. The final analysis examined the correlation between the expression of these “target genes” (defined as genes whose expression was associated with carboplatin sensitivity–related SNPs) and carboplatin IC50 phenotype by using a general linear model (7). The expression of 24 transcript clusters (representing 26 genes) was correlated to carboplatin IC50 at P < 5 × 10−2. The P values were used not to assess significance but as a tool to filter SNPs that show at least moderate evidence of being functional by being correlated to genes whose expression relates to drug sensitivity (11).
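The removal of redundant SNPs in high LD can be sketched as a greedy pruning step. The code below is an illustration under stated assumptions: r2 is approximated by the squared Pearson correlation of genotype allele counts rather than the haplotype-based value that Haploview reports, the most significant SNP in each correlated group is the one retained, and the SNP identifiers, P values, and genotypes are hypothetical.

```python
def genotype_r2(a, b):
    """Squared Pearson correlation between two SNPs' allele counts (0, 1, 2)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    saa = sum((x - mean_a) ** 2 for x in a)
    sbb = sum((y - mean_b) ** 2 for y in b)
    sab = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    if saa == 0 or sbb == 0:
        return 0.0  # a monomorphic SNP carries no correlation information
    return sab ** 2 / (saa * sbb)

def prune_by_ld(snps, threshold=0.8):
    """Keep a SNP only if its r2 with every SNP already kept is below the threshold.

    snps: list of (snp_id, p_value, genotypes); considered in order of increasing P value.
    """
    kept = []
    for snp in sorted(snps, key=lambda s: s[1]):
        if all(genotype_r2(snp[2], k[2]) < threshold for k in kept):
            kept.append(snp)
    return [s[0] for s in kept]

# Hypothetical SNPs: snp_b duplicates snp_a's genotypes, snp_c is weakly correlated.
snps = [
    ("snp_a", 1e-5, [0, 1, 2, 0, 1, 2]),
    ("snp_b", 2e-5, [0, 1, 2, 0, 1, 2]),
    ("snp_c", 5e-5, [0, 0, 1, 2, 1, 0]),
]
unique_signals = prune_by_ld(snps)
```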
Evaluation of SNPs in a replication set
We successfully genotyped 17 of the 18 SNPs in the replication set. One SNP (rs1649942) was significantly associated with carboplatin sensitivity (Bonferroni adjusted P < 5 × 10−2; Fig. 1). The next SNP ordered by strength of association was rs12053210 (in complete LD with rs12614692) with unadjusted P = 1 × 10−2 and false discovery rate (FDR) = 9 × 10−2. This is not significant at the Bonferroni corrected 0.05 level, and the effect in the replication set was opposite of that found in the discovery set (Supplementary Table S4).
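The FDR procedure is not named in the text. Assuming the Benjamini–Hochberg method applied across the 17 genotyped SNPs, the second-ranked P value of 1 × 10−2 gives an adjusted value of 0.01 × 17 / 2 = 0.085, which agrees with the reported FDR of 9 × 10−2 after rounding. In the sketch below only that second P value comes from the text; the other 16 are hypothetical placeholders.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# 17 tests: the second entry (0.01) is from the text, the rest are placeholders.
p_values = [0.001, 0.01] + [0.5] * 15
fdr = benjamini_hochberg(p_values)
fdr_second = fdr[1]

# Bonferroni adjustment of the same P value for comparison.
bonferroni_second = min(1.0, 0.01 * len(p_values))
```

Neither adjustment brings the second-ranked SNP below 5 × 10−2, consistent with only rs1649942 being carried forward.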
SNP rs1649942 was previously reported to be associated with cisplatin IC50 in HapMap CEU discovery samples (ref. 10; Supplementary Fig. S2A) and was found to be associated with cisplatin IC50 in the replication samples (Supplementary Fig. S2B). This SNP is located within the intron of NRG3 (Fig. 1A) and has been previously shown to be a master regulator associated with the expression level of 39 genes at a value of P ≤ 1 × 10−4 (23). In this study, using a more stringent cutoff (Bonferroni adjusted P < 5 × 10−2), we found the expression levels of 18 genes associated with this SNP genotype. Of them, 10 are associated with carboplatin IC50, with the gene expression levels of 7 negatively correlated with carboplatin IC50 (ALDH2, CRIM1, KYNU, LOC100131869, OAS1, RAPGEF5, and SLC2A5), suggesting the higher the gene expression, the greater cellular sensitivity to carboplatin. For the remaining 3 genes (BCR, PSTPIP2, and SHFM3P1), higher expression is associated with cellular resistance to carboplatin. In particular, we observed that the SNP was associated with the expression of ALDH2 and KYNU at a stringent Bonferroni threshold. Furthermore, the SNP ranked in the top 2% and 5% of all expression quantitative trait loci (eQTL) for ALDH2 and KYNU, respectively, at expression P value < 10−4. Importantly, a genome-wide scan (using a publicly available genomic resource we created; http://www.scandb.org) revealed that no eQTL more predictive of expression for either gene showed an association with carboplatin IC50. In addition, rs1649942 has an integrated haplotype score (iHS) of 2.8, which indicates a shorter derived allele haplotype at the SNP locus, suggesting functional significance and recent positive selection in Caucasians.
Examination of genetic variants in ovarian cancer clinical samples
We genotyped rs1649942 in a 2-phase validation analysis of 377 non-Hispanic white AOCS (phase 1) and 1,326 non-Hispanic white patients (phase 2) from 6 OCAC sites, TCGA, and an additional 154 patients from the AOCS (see Supplementary Table S2). All genotype data conformed to Hardy–Weinberg proportions (PHWE ≥ 0.1), and the MAF for rs1649942 was 0.24 (site-specific range = 0.19–0.27). In phase 1 analysis, we observed a significant decrease in PFS associated with each additional copy of the minor allele of the rs1649942 SNP [adjusted HRper-allele = 1.25 (95% CI: 1.03–1.52), P = 2.3 × 10−2]. This association with PFS was even more pronounced in a subset of women with optimally debulked tumors (residual disease ≤1 cm) [adjusted HRper-allele = 1.43 (95% CI: 1.12–1.81), P = 4 × 10−3]. Analysis of OS using right censoring at 5 years postdiagnosis to account for deaths unrelated to ovarian cancer also showed a significant association with this SNP in optimally debulked patients [adjusted HRper-allele = 1.48 (95% CI: 1.10–2.0), P = 9 × 10−3; Table 2 and Fig. 2)]. In phase 2 validation, baseline median PFS was significantly different across sites (Plog-rank = 7 × 10−4) ranging from 15 to 30 months (see Supplementary Fig. S1), which we accounted for in summary estimates using the weighted meta-analysis approach. However, in phase 2 analyses of an additional 1,326 non-Hispanic white patients from multiple sites, neither the site-specific estimates nor the summary estimates from weighted meta-analysis supported the associations observed between PFS or OS and the rs1649942 SNP in phase 1 analysis (see Supplementary Fig. S3). 
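As a consistency check on the phase 1 estimates, the Wald statistic and an approximate two-sided P value can be recovered from each HR and its 95% CI. The sketch below assumes a normal approximation on the loge HR scale; because the published HRs and CIs are rounded, the recovered P values only approximate those reported. It also shows that, under the log-additive model, carrying two minor alleles corresponds to the square of the per-allele HR.

```python
import math

def wald_from_hr_ci(hr, lower, upper, z_crit=1.96):
    """Recover the Wald z statistic and two-sided P value from an HR and its 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

# Phase 1 PFS estimates quoted in the text.
z_all, p_all = wald_from_hr_ci(1.25, 1.03, 1.52)   # all patients
z_opt, p_opt = wald_from_hr_ci(1.43, 1.12, 1.81)   # optimally debulked subset

# Implied HR for carriers of two minor alleles relative to non-carriers.
hr_two_copies_all = 1.25 ** 2
```

The recovered values fall close to the reported P = 2.3 × 10−2 and P = 4 × 10−3, which suggests the HRs, CIs, and P values in this paragraph are mutually consistent.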
In light of the null findings from phase 2 validation analysis, we reanalyzed phase 1 AOCS data using the same analytical methods as for phase 2. We observed a similar, although borderline, association for PFS in all patients [adjusted HRper-allele = 1.22 (95% CI: 0.98–1.52), P = 7 × 10−2] and a significant association in patients with no residual disease [adjusted HRper-allele = 1.81 (95% CI: 1.16–2.81), P = 9 × 10−3], but no significant association for OS in either all patients or the subset with no residual disease (adjusted Pper-allele > 9 × 10−2). When we restricted phase 2 analysis to patients known to have had the standard doses of paclitaxel and carboplatin (n = 776), we found no supporting evidence for the associations observed in phase 1. Likewise, analysis of all available patient data from the OCAC validation sites regardless of chemotherapy regimen (n = 3,190) yielded no additional support for phase 1 associations.
Relationship between rs1649942 and target gene expression
To evaluate the potential functional significance of this SNP, we examined the relationship between the SNP and its target gene expression in LCLs and NCI-60 cell lines. In the discovery set of LCLs, we found a significant association between the clinically validated SNP (rs1649942) and the baseline expression of 18 genes including ALDH2 (Fig. 3A and Supplementary Table S4) and KYNU. In addition, 10 of the 18 gene expression traits are correlated with carboplatin IC50 (Supplementary Table S4). Figure 3B illustrates that increasing the expression of ALDH2, a target gene, confers greater carboplatin sensitivity in LCLs. Incidentally, ALDH2 expression also showed a borderline significant correlation with cisplatin sensitivity (IC50) in LCLs (P = 8 × 10−2, Supplementary Fig. S2C). In a set of 7 ovarian cancer cell lines as part of NCI-60 cancer cell line resource, using the GC180405 microarray_U133 array data, we found a borderline significant association between ALDH2 expression and log10GI50 values of carboplatin (P = 8 × 10−2; Fig. 3C) and cisplatin (P = 5 × 10−2; Supplementary Fig. S2D). This is in agreement with our LCL findings that higher gene expression confers greater platinum sensitivity. In addition, another gene (KYNU), whose expression is associated with rs1649942 genotype, was recently reported as one of the genes within an expression signature that predicted OS in ovarian cancer patients receiving platinum-based chemotherapy (24).
Discussion
In this study, we used LCLs from the well-genotyped International HapMap collection and identified 18 unique SNPs that associate with carboplatin sensitivity from more than 2 million SNPs. One of these was replicated in a set of independent LCL samples (Bonferroni corrected Pc < 5 × 10−2). The SNP of interest (rs1649942) shows a value of r2 of 0.20 and 0.23 with carboplatin IC50 in CEU discovery and replication samples, respectively, suggesting that this SNP explains about 20% of the phenotypic variation. We found that this SNP is associated with PFS and OS in phase 1 analysis of 377 Australian ovarian cancer patients who received at least 4 cycles of carboplatin-based chemotherapy. However, in a larger, second phase of evaluation of patient samples, we did not replicate these findings. The potential mechanism of action of this SNP in LCLs may be through its association with the expression of 18 target genes (e.g., ALDH2 and KYNU, using a stringent Bonferroni cutoff). Ten of these target gene expression traits are also correlated with carboplatin sensitivity in LCLs.
There is a pressing need to identify germline variation that predicts response to standard therapy for advanced ovarian cancer (platinum plus taxane) because the 5-year survival rate is approximately 45%. In fact, ovarian cancer kills approximately 15,000 women in the United States every year and more than 140,000 women worldwide (25). Thus, identifying those at risk for nonresponse to certain chemotherapy allows for the possibility of administering alternative chemotherapy and potentially improving treatment outcomes.
An alternative approach to “personalized medicine” is to identify a set of gene expression signatures instead of genetic variants. In fact, a 14-gene expression predictive model was developed to predict early relapse in women with advanced ovarian cancer and treated with platinum–paclitaxel (26). However, evaluating gene expression in tumors of patients is cumbersome, variable, and expensive. A candidate gene approach has also been attempted to identify genetic markers that predict ovarian cancer treatment outcomes but failed to produce unequivocal results (27). GWAS provides an unbiased approach to evaluate all genetic variation in the genome that may contribute to disease risks (28–30) and/or drug response (31). Therefore, we employed GWAS in a cell-based model to identify germline variants with clinical applicability. The in vitro model system could be applied to other toxic drugs that would be difficult, if not impossible, to study in nondiseased patients. GWAS identified SNPs in this study that would not have been likely “candidate SNPs” based on the pharmacokinetics, pharmacodynamics, or mechanism of action of the drug.
In LCLs, the rs1649942 SNP is within the neuregulin 3 (NRG3) gene, which has been shown to activate the tyrosine phosphorylation of its cognate receptor, ERBB4, and is thought to influence neuroblast proliferation, migration, and differentiation by signaling through ERBB4 (32, 33). SNPs within the NRG3 gene have been implicated in heart failure mortality (34), schizophrenia (35), and attention-deficit/hyperactivity disorder (ADHD; ref. 36). However, the NRG3 gene itself was not well represented on the exon array. To interrogate the genomic region more closely, we used whole-genome sequence data from the 1000 Genomes Project and identified 4 additional SNPs in moderate LD (r2 > 0.70) with our SNP in CEU; none showed more significant association with carboplatin IC50. Furthermore, we explored the possibility that the SNP distantly regulates other genes in the genome to achieve its effect. Indeed, rs1649942 genotype is strongly associated with more than 10 transcriptional expression traits, suggesting that it may be a genomic master regulator (23). A simple base pair change at this locus may produce a cascade of expression signal changes, resulting in phenotypic variation (in our case, patient survival post–carboplatin treatment).
Interestingly, we found that both the SNP (rs1649942) and one of its target genes (ALDH2, a mitochondrial isoform of aldehyde dehydrogenase) were also associated with sensitivity to cisplatin, another commonly used platinating agent, in our LCL model (10). A recent report showed that higher expression of ALDH1, a cytosolic isoform of the aldehyde dehydrogenase family, is associated with higher response to chemotherapy, longer disease-free survival, and longer OS time in ovarian cancers (37). In agreement, we found that higher ALDH2 expression was significantly correlated to sensitivity to platinum-induced cytotoxicity in both LCLs and ovarian cancer cell lines. ALDH1 was not identified in our model because this gene is not expressed in HapMap CEU samples.
Despite its many advantages over other approaches, GWAS may suffer from a high FDR. Therefore, NCI and NHGRI have jointly published a set of guidelines suggested to be used in designing a replication study following GWAS (15). Our study sought to adhere closely to these guidelines and encompassed not only an independent set of in vitro replication samples but also in vivo clinical samples for validation. In our phase 1 in vivo study using 377 AOCS patients, the risk allele of rs1649942 was associated with a modest increased risk of disease progression and death following carboplatin-based chemotherapy, with an even greater genetic contribution for both PFS and OS among a subset of patients with optimally debulked tumors. The reason for the greater effect in this subset is not entirely clear, but this result mirrors our previous observation that an association between PFS and the ABCB1 2677G>T/A SNP was only seen in women with minimal residual disease (18). Because clinical outcomes obtained from optimally debulked patients may represent the ideal scenario in which to isolate effects due primarily to chemotherapy from the confounders associated with residual disease, the effect of rs1649942 in these particular patients is of interest, but it should be noted that this result was based on small numbers of patients. There were no significant associations observed between rs1649942 genotype and factors related to prognosis in ovarian cancer including patient age, stage, histology, and residual disease, suggesting that the observed genetic effect on patient survival is likely to be related to its effect on chemotherapeutic response rather than to disease characteristics.
We did not replicate the phase 1 findings in our phase 2 analysis, which differed from the phase 1 analysis in several ways. In phase 2, we categorized residual disease as “nil” versus “any,” as opposed to ≤1 or >1 cm, so that we could include patients from sites that did not use this coding, and we adjusted for grade and histology (in addition to stage, which we used for the adjusted analyses in phase 1). To increase our power, we also included patients (n = 550) presumed, rather than known, to have had standard doses of paclitaxel (175 or 135 mg/m²) and carboplatin (AUC = 5 or 6). However, when we reanalyzed phase 1 data using the same analytical method as for phase 2, we obtained similar significant associations with rs1649942. When we restricted phase 2 analysis to patients with known doses as in phase 1 (n = 776), we still found no association with this SNP. Although phase 1 estimates were based on small numbers and may represent a false discovery, it is possible that the failure to observe an association with the rs1649942 SNP in phase 2 analysis reflects differences in clinical definitions across studies that cannot be adequately accounted for in the analysis, together with low power to detect an association in the 776 patients whose treatment details were known. For example, the criteria used to define disease progression varied across studies, and in some cohorts (TCGA) no consistent definition of progression was used. Time to progression was the clinical outcome measure used in this study, as the measurement of “response” to primary chemotherapy in ovarian cancer is confounded by the fact that chemotherapy is combined with debulking surgery. A fall in CA125 level cannot distinguish between the effects of chemotherapy and the effects of surgery, and imaging can be used only to assess response in patients with measurable disease remaining at the end of surgery (i.e., not in optimally debulked patients; ref. 38).
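As a rough illustration of the power consideration raised above, the sketch below approximates the power of a two-sided test for a per-allele (additive) effect in a Cox proportional hazards model. It uses the standard events-based approximation for a quantitative covariate, with the genotype variance taken as 2q(1 − q) under Hardy-Weinberg equilibrium. It is not the analysis conducted in this study: the function name, the number of progression events, the hazard ratio, and the allele frequency are all hypothetical placeholders chosen only to show how power scales with these inputs.

```python
from math import log, sqrt
from statistics import NormalDist


def cox_additive_power(n_events, hazard_ratio, allele_freq, alpha=0.05):
    """Approximate power to detect a per-allele hazard ratio in a Cox model.

    Assumptions (not taken from the study): additive genotype coding (0/1/2),
    Hardy-Weinberg equilibrium, no correlation with other covariates, and a
    two-sided Wald test. Only the tail in the direction of the true effect
    is counted, which is accurate unless power is very low.
    """
    norm = NormalDist()
    genotype_var = 2 * allele_freq * (1 - allele_freq)  # variance of 0/1/2 coding
    noncentrality = abs(log(hazard_ratio)) * sqrt(n_events * genotype_var)
    z_crit = norm.inv_cdf(1 - alpha / 2)
    return norm.cdf(noncentrality - z_crit)


if __name__ == "__main__":
    # Hypothetical inputs for illustration only; none are reported values.
    for events in (200, 400, 800):
        power = cox_additive_power(events, hazard_ratio=1.2, allele_freq=0.3)
        print(f"events={events:4d}  approx. power={power:.2f}")
```

Under this approximation, power depends on the number of observed progression events rather than on the number of patients enrolled, which is one reason that restricting an analysis to patients with known treatment details can reduce power substantially.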
The clinical validation studies used self-reported ethnicity to identify non-Hispanic white patients. Because the cell-based finding on rs1649942 is specific to Caucasians, the use of self-reported ancestry is likely to include patients with differing ethnic backgrounds and could mask the Caucasian-specific association. This is particularly true in the second phase validation study because patients were recruited from various sites across the world. Using ancestry informative markers to define ethnicity could potentially influence the final findings (39).
Interestingly, we have recently identified a suggestive association between this SNP and therapy-induced decreases in platelets in 60 head and neck cancer patients who underwent carboplatin-based induction therapy (11). Given this and the results of our in vitro experiments, we cannot discount the possibility that this SNP may influence chemotherapy outcomes in ovarian cancer patients. Further analysis is warranted in larger, well-characterized clinical samples.
Given the obstacles to conducting large, replicable pharmacogenetic studies aimed at discovering novel variants, and the clinical confounders inherent in such studies, we developed cell-based models to identify genetic variants that may predict response in ovarian cancer patients. We acknowledge the limitations of using a cell-based model for pharmacogenomic discovery, but the advantages compared with attempting to carry out GWAS in a clinical trial are enormous, provided large cohorts of well-characterized patients are available for validation. Cell-based models are much less expensive, many of the environmental confounders can be controlled, and the effects of a single chemotherapeutic agent can be studied. Therefore, our cell-based approach provides a useful alternative tool for identifying clinically relevant genotype–phenotype relationships through a genome-wide approach.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
The Pharmacogenetics of Anticancer Agents Research (PAAR) Group (http://pharmacogenetics.org) study was supported by NIH/NIGMS grant U01GM61393 and data deposits are supported by U01GM61374 (http://pharmgkb.org/). This study was also supported by University of Chicago Breast Cancer SPORE grant P50 CA125183 (to M.E. Dolan).
R.S. Huang received support from NIH/NIGMS grant K08GM089941, NCI R21 CA139278, and University of Chicago Cancer Center Support grant (#P30 CA14599) and a Breast Cancer SPORE Career Development award.
Y.S. Fraiman was one of the awardees of the Pritzker School of Medicine Experience in Research (PSOMER) program, supported by the National Heart Lung and Blood Institute (grant NHLBI 2 T35 HL07764).
The AOCS was supported by the U.S. Army Medical Research and Materiel Command under DAMD17-01-1-0729, the National Health and Medical Research Council (NHMRC) of Australia, Cancer Council Victoria, Cancer Council Queensland, Cancer Council New South Wales, Cancer Council South Australia, The Cancer Foundation of Western Australia, and Cancer Council Tasmania. G.C. Trench is a Senior Principal Research fellow of the NHMRC, and this work was supported by NHMRC funding. Y. Li is funded by NHMRC grant 496675, and S. MacGregor is supported by an NHMRC career development award.
The Mayo Clinic study is supported by R01 CA122443 and P50 CA136393. SCOTROC biological studies were supported by Cancer Research UK (grant C536/A6689).
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
The authors thank Steve Wisel for excellent technical support in maintaining the cell lines. The authors also acknowledge the cooperation of all participating institutions and the contributions of the women who participated in this study. The full AOCS Study Group is listed at http://www.aocstudy.org/. The results published here are in part based upon data generated by The Cancer Genome Atlas Pilot Project established by the National Cancer Institute and National Human Genome Research Institute. Information about TCGA can be found at http://cancergenome.nih.gov/.
Note: Supplementary data for this article are available at Clinical Cancer Research Online (http://clincancerres.aacrjournals.org/).
- Received March 22, 2011.
- Revision received June 9, 2011.
- Accepted June 13, 2011.
- ©2011 American Association for Cancer Research.