
Imaging, Diagnosis, Prognosis
Authors' Affiliations: 1 Rush University Medical Center, Chicago, Illinois and 2 Genomic Health, Inc., Redwood City, California
Requests for reprints: Melody A. Cobleigh, Rush University Medical Center, Professional Building, 1725 West Harrison, Suite 821, Chicago, IL 60612. Phone: 312-942-2242; Fax: 312-563-3123; E-mail: melody_cobleigh{at}rush.edu.
| Abstract |
|---|
Experimental Design: Patients with ≥10 nodes diagnosed from 1979 to 1999 were identified. RNA was extracted from paraffin blocks, and expression of 203 candidate genes was quantified using reverse transcription-PCR (RT-PCR).
Results: Seventy-eight patients were studied. As of August 2002, 77% of patients had distant recurrence or breast cancer death. Univariate Cox analysis of clinical and immunohistochemistry variables indicated that HER2/immunohistochemistry, number of involved nodes, progesterone receptor (PR)/immunohistochemistry (% cells), and ER/immunohistochemistry (% cells) were significantly associated with distant recurrence-free survival (DRFS). Univariate Cox analysis identified 22 genes associated with DRFS. Higher expression correlated with shorter DRFS for the HER2 adaptor GRB7 and the macrophage marker CD68. Higher expression correlated with longer DRFS for tumor protein p53-binding protein 2 (TP53BP2) and the ER axis genes PR and Bcl2. Multivariate methods, including stepwise variable selection and bootstrap resampling of the Cox proportional hazards regression model, identified several genes, including TP53BP2 and Bcl2, as significant predictors of DRFS.
Conclusion: Tumor gene expression profiles of archival tissues, some more than 20 years old, provide significant information about risk of distant recurrence even among patients with 10 or more nodes.
Study of tumor molecular characteristics has enhanced our understanding of both the risk of breast cancer recurrence and the response to therapy. Assays to determine estrogen receptor (ER), progesterone receptor (PR), and HER2 receptor status are routinely done and used in breast cancer treatment planning (3, 7-11). Hundreds of other putative markers have been identified by immunohistochemistry or biochemical assays in studies that focused on one gene or a few genes at a time. For example, several studies found that elevated levels of urokinase-type plasminogen activator and type 1 plasminogen activator inhibitor proteins predict breast cancer recurrence (12, 13). However, few of the many putative markers have been clinically validated.
Two recent complementary advances enable a new approach for the development and clinical validation of prognostic and predictive biomarkers. First, sensitive and quantitative expression assays for thousands of genes in fresh frozen tissue using DNA microarrays (14-19) can be used to identify candidate markers for validation. Although frozen tumor tissue is available for hypothesis-generating microarray studies, relatively few large banks of frozen tumor tissue exist with associated long-term recurrence and survival information. Second, reverse transcription-PCR (RT-PCR) can be used to quantify the expression of tens to hundreds of genes from a few thin sections of a formalin-fixed, paraffin-embedded tissue (20-22). Thus, RT-PCR can be used to validate candidate gene expression profiles in clinical studies, as well as to report results to be used in clinical practice.
To develop a sensitive, specific, reproducible, and practical test for breast cancer prognosis, we have created and optimized a multigene quantitative RT-PCR expression assay (22) that measures the expression of 187 cancer-related "candidate" genes and five reference genes, typically using no more than three 10-µm formalin-fixed, paraffin-embedded sections. The represented genes were selected by surveying the research literature, including published DNA array studies, for genes implicated in cancer pathology or prognosis.
In two studies reported elsewhere, we have explored the relationship between expression of these candidate genes and recurrence in early breast cancer in independent, mostly node-negative, cohorts of breast cancer patients (23, 24). Here, we studied the correlation between tumor gene expression and distant recurrence-free survival (DRFS) in a group of patients with poor prognosis. Tabesh et al. have identified women who were operated on at Rush University Medical Center and who had ≥10 nodes (25). A subset of these women was found to have prolonged disease-free survival. We hypothesized that examination of this node-positive cohort would assist in identifying genes that are consistently related to the likelihood of recurrence across the range of breast cancer stage. This study contributed to culling of the original candidate genes and to the creation of a Recurrence Score algorithm that was subsequently validated in a large cooperative group trial involving tamoxifen-treated patients with node-negative breast cancer (26).
| Patients and Methods |
|---|
Patients. Eligible patients were those with ≥10 positive nodes and no evidence of metastatic disease who had surgery at Rush University Medical Center from 1979 to 1999. Of 138 eligible patients, archival fixed paraffin blocks were available from the primary tumor in 86 patients. If multiple blocks were available, the block most representative of the diagnosis was selected. A threshold of tumor cellularity (at least 5% of the area invasive cancer) was used for eligibility. Most of the sections contained extensive tumor; the proportion of tumor was >50% in 73% of the cases and <20% in 5% of the cases. This study was done after approval by the local Human Investigations Committee.
Sample preparation. Three 10-µm sections were cut from each paraffin block and placed in a bar-coded microcentrifuge tube. One additional 5-µm section was cut and stained with H&E. Tubes and slides were shipped to Genomic Health, Inc. (Redwood City, CA) at ambient temperature.
Gene expression analysis. Quantitative gene expression was determined by a multianalyte Taqman RT-PCR assay that was designed to accurately measure the small fragments of RNA present in archival tumor blocks (22).
We selected candidate genes by surveying the breast cancer literature for evidence of a significant role in cancer pathologic processes, including proliferation, invasion, sensitivity to apoptosis, metastasis, angiogenesis, immune surveillance, tumor suppression activity, oncogene activity, and differentiation status. Additionally, we included a number of genes identified in published DNA microarray studies of breast cancer, primarily from two groups of investigators. Sorlie et al. (27) used unsupervised cluster analysis to identify putative breast cancer cell subclasses. Van't Veer et al. (17) used supervised learning methods to identify genes whose expression correlated positively or negatively with breast cancer recurrence.
Paraffin was removed from specimens by xylene extraction. RNA was isolated using Epicentre Technologies, Inc. (Madison, WI) MasterPure RNA Purification kit. Total RNA content was measured using the RiboGreen RNA Quantitation Kit (Invitrogen, Carlsbad, CA). Residual genomic DNA contamination was assayed by Taqman PCR assay for β-actin DNA. Samples with contaminating DNA were resubjected to DNase I treatment and assayed again for DNA contamination.
Reverse transcription of the purified RNA was carried out using SuperScript II RT enzyme (Invitrogen) for first-strand cDNA synthesis.
Taqman reactions were carried out in 384-well plates according to manufacturer's instructions, using Applied Biosystems PRISM 7900HT Taqman instruments (Foster City, CA). A total of 192 genes (187 candidate genes and 5 reference genes) were tested initially. RNA extracts were subsequently retested for expression of 16 additional candidate genes. Expression of each gene was measured in duplicate and then normalized relative to the five reference genes (β-actin, GAPDH, GUS, RPLPO, and TFRC; ref. 22). Reference-normalized expression measurements typically range from 0 to 15, where a 1-unit change generally reflects a 2-fold change in RNA.
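The normalization step can be illustrated with a minimal sketch. It is not the assay's actual procedure: it assumes that the raw measurements are threshold cycle (Ct) values, that duplicates are averaged, and that a constant offset of 10 is added to keep values positive. The function name, the offset, and all example numbers are hypothetical. The only property taken from the text is that a 1-unit change corresponds to roughly a 2-fold change in RNA.

```python
from statistics import mean

REFERENCE_GENES = ("ACTB", "GAPDH", "GUS", "RPLPO", "TFRC")

def reference_normalize(ct, gene, offset=10.0):
    """Return expression of `gene` relative to the mean of the reference genes.

    `ct` maps gene name -> list of duplicate Ct readings (assumed input format).
    A lower Ct means more RNA, so the sign is flipped: one unit higher means
    roughly twice as much RNA. The offset is an assumption, not a source value.
    """
    reference_ct = mean(mean(ct[g]) for g in REFERENCE_GENES)
    return reference_ct - mean(ct[gene]) + offset

# Hypothetical duplicate Ct readings for one tumor sample.
sample = {
    "ACTB": [24.0, 24.2], "GAPDH": [25.1, 24.9], "GUS": [28.0, 28.0],
    "RPLPO": [23.9, 24.1], "TFRC": [27.0, 27.0],
    "BCL2": [29.8, 29.8], "GRB7": [30.8, 30.8],
}
bcl2 = reference_normalize(sample, "BCL2")
grb7 = reference_normalize(sample, "GRB7")
fold = 2 ** (bcl2 - grb7)  # a 1-unit difference is about a 2-fold difference
```

In this toy sample the two genes differ by one normalized unit, so `fold` comes out at about 2.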
Clinical assessments and outcomes. Tumor size, nodal status, and systemic treatment (hormonal and/or chemotherapy) at initial diagnosis were entered prospectively into a database begun in 1979 and were reviewed for accuracy by examining case records. The primary end point of DRFS was determined from the dates of systemic disease recurrence, death, and/or last follow-up. Cause of death was classified as due to breast cancer, due to other cause, or of uncertain cause.
Other assessments. H&E-stained slides submitted to Genomic Health were reviewed by a pathologist for confirmation of the submitting diagnosis and for assessment of the proportion of tissue surface area composed of tumor. Immunohistochemistry for ER, PR, HER2, and Ki-67/MIB-1 protein was done by the avidin-biotin peroxidase technique, using reagents from DAKO (Carpinteria, CA). The threshold for categorizing immunohistochemistry as positive was 10% of cells. Tumor grade using the Bloom-Richardson criteria was independently assessed at Rush University Medical Center and at Genomic Health (28).
Statistical methods. Demographic and baseline characteristics were summarized by descriptive statistics. Cox proportional hazards models were applied in univariate, multivariate, and stepwise analysis of DRFS as a function of clinical variables and gene expression. For the primary analysis, death due to uncertain cause was treated as death due to breast cancer, and death due to causes other than breast cancer was censored at the time of death. Because the number of candidate genes relative to the number of patients was large, we did simulations in which we randomly shuffled the patient survival times versus gene expression to estimate the number of genes that would seem significant (P < 0.01 or P < 0.05) in the absence of a genuine association with survival (false-positive rate). Correlation analyses of gene expression used Pearson linear correlation. Cluster analysis used 1 - Pearson r as the distance metric and single-linkage hierarchical clustering. The Recurrence Score was calculated as described by Paik et al. (26) from the expression of 21 genes: 16 cancer-related genes (Ki67, STK15, Survivin, CCNB1 or cyclin B1, MYBL2, GRB7, HER2, ER, PGR, BCL2, SCUBE2, MMP11 or stromelysin 3, CTSL2 or cathepsin L2, GSTM1, CD68, and BAG1) and five reference genes (ACTB or β-actin, GAPDH, RPLPO, GUS, and TFRC).
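The clustering approach named above (distance of 1 minus the Pearson correlation, single linkage) can be sketched in a few lines. This is an illustration under stated assumptions, not the software used in the study: the function names, the distance cutoff, and the expression values are hypothetical, and the dendrogram is simply cut at a fixed distance.

```python
from math import sqrt

def pearson(x, y):
    """Pearson linear correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def single_linkage_clusters(expr, max_distance):
    """Merge genes until the closest pair of clusters is farther than max_distance.

    `expr` maps gene name -> expression values across patients (assumed format).
    """
    def dist(a, b):
        return 1.0 - pearson(expr[a], expr[b])

    clusters = [{g} for g in expr]
    while len(clusters) > 1:
        # Single linkage: cluster distance is the smallest pairwise gene distance.
        d, i, j = min(
            (min(dist(a, b) for a in clusters[i] for b in clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if d > max_distance:
            break
        clusters[i] |= clusters.pop(j)
    return clusters

# Hypothetical expression values for six patients (not study data).
toy = {
    "HER2": [1, 2, 3, 4, 5, 6],
    "GRB7": [1.1, 2.0, 3.2, 3.9, 5.1, 6.0],
    "ER": [6, 5, 4, 3, 2, 1],
    "PR": [5.8, 5.1, 4.2, 2.9, 2.1, 1.0],
}
groups = single_linkage_clusters(toy, max_distance=0.5)
```

With these toy values the two positively correlated pairs merge first, and the loop stops because the remaining clusters are negatively correlated with each other.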
| Results |
|---|
Characteristics and outcomes for the 78 evaluable patients are listed in Table 1. The mean tumor size was 4.4 ± 3.3 cm (mean ± SD). Of note, 33% had T1 tumors. The cohort had extensive nodal involvement. The median number of involved nodes was 15 (range, 10-40). Adjuvant tamoxifen was given in 54% of patients. During the 1980s, many premenopausal patients with ER-positive tumors did not receive tamoxifen. Adjuvant chemotherapy was administered in 80% of patients. Anthracycline-based regimens were used in 26 patients (five with the addition of a taxane); cyclophosphamide, methotrexate, and fluorouracil-based regimens in 29 patients; L-PAM/5-fluorouracil in 10 patients; and L-PAM alone in one case (four patients received multiple regimens). The use of diverse chemotherapeutic agents reflects the long follow-up of the cohort. The patients who did not receive chemotherapy were significantly older than the mean (only one was <60 years of age). Median follow-up was 15.1 years, and median time to event (distant recurrence or death due to breast cancer or uncertain cause) was 2.6 years.
The Kaplan-Meier plots of the time to distant recurrence for nodal status, ER status by immunohistochemistry, and HER2 status by immunohistochemistry are shown in Fig. 1A-C.
Univariate analysis of gene expression and distant recurrence-free survival. The relative risk of DRFS for each of the candidate genes with unadjusted P < 0.1 is given in Table 2. Four genes have a P < 0.01, and a total of 22 genes have a P < 0.05. Higher expression was associated with greater risk of distant recurrence for GRB7, CD68, CA9, and CTSL. Higher expression was associated with a lower risk of distant recurrence for, among other genes, Bcl2, TP53BP2, PR, PRAME, GSTM1, ER, and BAG1.
The Kaplan-Meier plots of the time to distant recurrence for varying expression of the top two genes by RT-PCR, Bcl2 and GRB7, are shown in Fig. 1D-E.
The procedure of Benjamini and Hochberg was again applied to control for false discovery rates (29). With regard to gene expression measurements of DRFS, four genes (Bcl2, GRB7, TP53BP2, and PR) remained significant.
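For readers unfamiliar with the procedure, the Benjamini-Hochberg step-up rule can be sketched as follows. The false discovery rate level used in the study is not stated here, so the 0.05 default is an assumption, and the gene labels and P values are hypothetical placeholders, not values from Table 2.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return the names whose P values pass the Benjamini-Hochberg step-up rule.

    `p_values` maps a name to its unadjusted P value. The largest rank k with
    p(k) <= (k / m) * fdr is found, and every test ranked at or below k passes.
    """
    ranked = sorted(p_values.items(), key=lambda item: item[1])
    m = len(ranked)
    cutoff_rank = 0
    for rank, (_, p) in enumerate(ranked, start=1):
        if p <= rank / m * fdr:
            cutoff_rank = rank
    return {name for name, _ in ranked[:cutoff_rank]}

# Hypothetical unadjusted P values for five placeholder genes.
toy_p = {"gene_a": 0.001, "gene_b": 0.008, "gene_c": 0.035,
         "gene_d": 0.039, "gene_e": 0.20}
passed = benjamini_hochberg(toy_p)
```

In this toy input, `gene_c` misses its own threshold (0.035 > 0.03) but still passes because `gene_d`, ranked after it, meets its threshold (0.039 ≤ 0.04); this is the step-up behavior that distinguishes the procedure from a rank-by-rank cutoff.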
Simulations based on randomly shuffling DRFS times, done to determine the number of genes that would be expected to correlate with DRFS by chance, indicated that two genes would be expected to have a P < 0.01, and 10 genes have a P < 0.05.
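These counts are close to the nominal expectation for 203 candidate genes (203 × 0.01 ≈ 2 and 203 × 0.05 ≈ 10). The shuffling idea can be sketched as below. This is an assumption-laden illustration, not the study's code: the per-gene test is the Cox score test at β = 0 (a stand-in for whichever Cox test statistic was actually used), ties are handled in the Breslow manner, and the patients, genes, and all numbers are simulated and hypothetical.

```python
import math
import random

def cox_score_p(times, events, x):
    """Two-sided P value from the Cox score test at beta = 0, one covariate."""
    u = v = 0.0
    for t_i, d_i, x_i in zip(times, events, x):
        if not d_i:  # censored patients contribute only to risk sets
            continue
        risk = [x_j for t_j, x_j in zip(times, x) if t_j >= t_i]
        m = sum(risk) / len(risk)
        u += x_i - m                                      # score contribution
        v += sum((r - m) ** 2 for r in risk) / len(risk)  # information
    return math.erfc(abs(u) / math.sqrt(2.0 * v)) if v > 0 else 1.0

def expected_chance_hits(times, events, expr, alpha, n_shuffles=20, seed=0):
    """Mean number of genes with P < alpha after shuffling patient outcomes."""
    rng = random.Random(seed)
    outcome = list(zip(times, events))  # keep each time with its event flag
    total = 0
    for _ in range(n_shuffles):
        rng.shuffle(outcome)            # breaks any real gene-outcome link
        t, d = zip(*outcome)
        total += sum(cox_score_p(t, d, x) < alpha for x in expr.values())
    return total / n_shuffles

# Simulated toy data: 40 hypothetical patients, 50 hypothetical noise genes.
rng = random.Random(1)
n_patients, n_genes = 40, 50
times = [rng.expovariate(0.3) for _ in range(n_patients)]
events = [1 if rng.random() < 0.7 else 0 for _ in range(n_patients)]
expr = {f"gene_{g}": [rng.gauss(8, 2) for _ in range(n_patients)]
        for g in range(n_genes)}
hits = expected_chance_hits(times, events, expr, alpha=0.05)
```

Because the toy genes are pure noise, `hits` estimates how many genes cross the threshold by chance alone; comparing that number with the observed count of significant genes is the purpose of the simulation.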
Correlated gene expression. The assay revealed groups of coexpressed genes that were consistent with literature prediction and/or publicly available gene expression databases, indicating that the assay accurately measured the relative levels for the tested mRNAs. For example, expression of the following pairs of genes was highly correlated, as predicted by the work of Sorlie et al. (27): cytokeratin 5 and cytokeratin 17 (r = 0.86), LPL and RBP4 (r = 0.82), HER2 and GRB7 (r = 0.82). For the ER gene, two independent probe-primer sets were tested and yielded very similar values for ER mRNA level (r = 0.96).
Unsupervised cluster analysis of the genes that correlated with DRFS (P < 0.1) revealed a cluster of genes known to coexpress with ER, including Bcl2, ER, IGF1R, SCUBE2, PR, and IGFBP2; a group of proliferation-related genes, including STK15 and Chk1; a group of HER2-related genes, including HER2 and GRB7; and a group of macrophage-related genes, including CD68 and cathepsin L2.
Immunohistochemistry and reverse transcription-PCR for estrogen receptor, progesterone receptor, HER2, and Ki-67/MIB-1. Immunohistochemistry and RT-PCR measurements of tumor expression for ER, PR, HER2, and Ki-67/MIB-1 were compared (Fig. 2A-D). The concordance between RT-PCR (mRNA) and immunohistochemistry (protein) was high for both ER (0.83) and HER2 (0.67) and somewhat lower for PR (0.40). In contrast, the correlation between RT-PCR for Ki-67 and immunohistochemistry for Ki-67 was poor (0.22). Similarly, correlation was poor between RT-PCR for other proliferation-related genes, such as PCNA and TOPO 2A, and immunohistochemistry for Ki-67.
Specifically, bootstrap resampling, based on 500 resamplings with replacement of the original data set, was done to assess the arbitrariness of the stepwise variable selection procedure. This analysis identified 30 variables that occurred in at least 10% of the 500 stepwise regressions, of which 26 were gene expression measurements. Ten of these 26 genes (BAG1, Bcl2, CD68, CTSL2, GRB7, GSTM1, MYBL2, PR, Survivin, and STK15) are included in the calculation of Recurrence Score as described by Paik et al. (26). The top clinical variables identified were HER2/immunohistochemistry, adjuvant tamoxifen, number of nodes involved, and tumor size.
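The bookkeeping behind such a bootstrap can be sketched as follows. The selection step here is a deliberately simple, hypothetical stand-in (a correlation threshold) rather than stepwise Cox regression, and the records, variable names, and threshold are invented for illustration. Only the resampling scheme (500 resamples with replacement, report variables selected in at least 10% of them) follows the text.

```python
import random
from collections import Counter
from math import sqrt

def bootstrap_selection_frequency(rows, select, n_resamples=500, seed=0):
    """Fraction of bootstrap resamples in which each variable is selected.

    `rows` is a list of per-patient records; `select` is any function that
    takes such a list and returns the names of the variables it keeps.
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_resamples):
        resample = rng.choices(rows, k=len(rows))  # draw with replacement
        counts.update(set(select(resample)))
    return {name: n / n_resamples for name, n in counts.items()}

def toy_select(rows, threshold=0.5):
    """Hypothetical stand-in for stepwise selection: keep variables whose
    absolute Pearson correlation with the outcome exceeds `threshold`."""
    y = [r["outcome"] for r in rows]
    kept = []
    for name in rows[0]:
        if name == "outcome":
            continue
        x = [r[name] for r in rows]
        mx, my = sum(x) / len(x), sum(y) / len(y)
        sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
        sxx = sum((a - mx) ** 2 for a in x)
        syy = sum((b - my) ** 2 for b in y)
        if sxx > 0 and syy > 0 and abs(sxy / sqrt(sxx * syy)) > threshold:
            kept.append(name)
    return kept

# Invented records: "signal" tracks the outcome, "noise" does not.
rows = [{"outcome": i, "signal": i + i % 3, "noise": (7 * i) % 11}
        for i in range(30)]
freq = bootstrap_selection_frequency(rows, toy_select)
stable = sorted(name for name, f in freq.items() if f >= 0.10)
```

Passing the selector in as a function keeps the resampling loop independent of the model, so a real stepwise procedure could be substituted without changing the frequency count.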
Analysis of the 21 genes in the recurrence score assay. Fourteen of the 16 cancer-related genes used in the Recurrence Score algorithm correlated with breast cancer recurrence, nine of them at P < 0.05 and 14 at P < 0.10. Five of the 16 genes belong to a group of proliferation genes (Pearson correlation coefficients range, 0.35-0.60); four of the 16 genes used in the algorithm belong to an ER group; the tightly linked HER2 and GRB7 genes and two invasion genes encoding extracellular matrix-degrading proteases are also represented.
Of the 78 evaluable patients in this study, 11 patients (14%) had a Recurrence Score of <18 and had a rate of distant recurrence at 10 years of 29% (95% confidence interval, 0-53%); 19 patients (24%) had a Recurrence Score between 18 and 31 and had a rate of distant recurrence at 10 years of 72% (95% confidence interval, 38-88%); and 48 patients (62%) had a Recurrence Score of ≥31 and had a rate of distant recurrence at 10 years of 80% (95% confidence interval, 63-89%).
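The grouping and the reported percentages can be checked with a short sketch. The boundary handling is an assumption: a score of exactly 18 is treated as intermediate and a score of 31 or above as high. The function and group names are illustrative and do not come from the assay documentation.

```python
def recurrence_score_group(score):
    """Assign a risk group from a Recurrence Score (boundaries assumed)."""
    if score < 18:
        return "low"
    if score < 31:
        return "intermediate"
    return "high"

# Patient counts per group as reported in the text.
counts = {"low": 11, "intermediate": 19, "high": 48}
total = sum(counts.values())
shares = {group: round(100 * n / total) for group, n in counts.items()}
```

The three counts sum to the 78 evaluable patients, and the rounded shares reproduce the 14%, 24%, and 62% given above.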
| Discussion |
|---|
This study examined the relationship between tumor gene expression and DRFS in breast cancer patients with ≥10 nodes. Clinical and pathologic variables, such as age, tumor size, number of involved nodes, and systemic treatment, had only modest correlation with the likelihood of recurrence. HER2 protein expression by immunohistochemistry was significantly correlated with DRFS. Tumor gene expression quantified by RT-PCR for 22 genes was significantly correlated with DRFS. The top genes were Bcl2 and TP53BP2, for which higher expression was associated with longer DRFS, and GRB7, for which higher expression was associated with shorter DRFS. Multivariate analysis of the clinical variables, pathology variables, and gene expression by RT-PCR for Bcl2 and TP53BP2 indicated that gene expression was independently associated with distant recurrence.
In the univariate analysis of gene expression by RT-PCR and DRFS, 22 of the 203 candidate genes were correlated with DRFS with a P < 0.05 (not adjusted for multiple comparisons; Table 2). Because of the relatively small sample size and multiple comparisons, it is likely that a number of these "significant" correlations are false positives, and that some important genes might not be detected. However, our strategy was to perform several independent studies with a similar panel of candidate genes to identify genes that consistently correlate with recurrence over a diverse range of patients and then to take this shorter list into a large study, where the gene list and algorithm for predicting risk are prospectively defined. Of note, 14 of the 34 genes associated with DRFS as shown in Table 2 were also significantly associated with distant recurrence in two independent studies of a lower risk group of breast cancer patients (23, 24) and were included in the Recurrence Score algorithm used in the subsequent large clinical validation studies (26, 30, 31).
Of the genes in Table 2 that were correlated with DRFS in the univariate analysis (unadjusted P < 0.10), many came from four multigene expression clusters that were defined by two criteria: (a) grouped by published gene expression analysis (15, 17, 27, 32) and/or by published evidence of related biological functions, and (b) coexpression in our study with correlation coefficients of >0.5. Without exception, genes within a cluster that influenced risk did so in the same direction. Five of the genes (ER, PR, Bcl2, SCUBE2, and IGF1R) came from an "ER cluster," where higher expression was associated with longer DRFS. Five of the genes (CHK1, STK15, MYBL2, cyclin B1, and Ki-67) came from a "proliferation cluster," where higher expression was associated with shorter DRFS. Two of the genes (GRB7 and HER2, which lie within 70 kb on chromosome 17 and are frequently coamplified in breast cancer) had a correlation coefficient of coexpression of 0.82 and form a "HER2 cluster." As expected, higher expression of GRB7 and HER2 was associated with shorter DRFS. Finally, three of the genes (CD68, CTSL, and CTSL2) came from a "macrophage cluster," where higher expression was associated with shorter DRFS. This is consistent with previous reports that cancer cells stimulate macrophages to produce various growth factors, angiogenesis factors, and matrix-degrading enzymes that in turn promote tumor growth and survival (33, 34).
Of note, several tested genes that have been previously linked to breast cancer recurrence did not correlate significantly with recurrence in the present study. For example, large clinical trials have correlated urokinase-type plasminogen activator and type 1 plasminogen activator inhibitor with shortened survival (12, 13). Those studies used protein assays (by ELISA) rather than mRNA, a distinction that might explain why we did not find significant correlation with these genes. Van't Veer et al. (17), who measured mRNA levels, likewise failed to detect a significant relationship between urokinase-type plasminogen activator, type 1 plasminogen activator inhibitor, and recurrence. Seventy genes were identified in that microarray study on fresh frozen tissue. Two of the 34 genes identified in this study (Table 2; SCUBE2 and BBC3) overlap with the previously reported 70-gene panel.
The accuracy and specificity of this RT-PCR assay of formalin-fixed, paraffin-embedded tumor tissue were supported by comparison of the results of RT-PCR assay of RNA and immunohistochemistry assay of protein for ER, PR, and HER2. Overall, the concordance between the RT-PCR and immunohistochemistry for ER, PR, and HER2 determinations was high. In contrast, the concordance between the RT-PCR measurements and immunohistochemistry assay for Ki-67 was poor. It has been reported previously that it is difficult to standardize and control the immunohistochemistry assay for Ki-67 (35). The results of this study suggest that caution should be exercised when using the immunohistochemistry assay for Ki-67 for clinical decision making.
The data generated in this study contributed to the development of the 21-gene assay and Recurrence Score algorithm that was subsequently tested in a prospective study on the tamoxifen-treated patients in NSABP Study B-14 who were node negative and had tumors that were ER positive (26). The sample size in this study of node-positive patients was not sufficient to address the relative performance of the Recurrence Score and standard measures, such as patient age, tumor size, and tumor grade. The larger independent NSABP B-14 study was also designed to determine whether the performance of the Recurrence Score exceeded standard measures.
It is important to emphasize that because this data set was used to develop the 21-gene assay, the performance of the Recurrence Score in this population cannot be taken as validation. However, as might be expected, the proportion of patients with Recurrence Score of <18 (low-risk group) was much smaller in this cohort of node-positive patients than in cohorts of node-negative patients. In addition, for any Recurrence Score category, the risk of recurrence for patients with ≥10 nodes was higher than for patients with negative nodes. New larger studies are needed to more accurately define the relationship between Recurrence Score and likelihood of recurrence in node-positive patients, especially for patients with one to three pathologically positive nodes or patients with nodes positive by immunohistochemistry alone.
In summary, this study found that tumor gene expression was correlated with the likelihood of distant recurrence in patients who have invasive breast cancer and ≥10 nodes and contributed to the development of the 21-gene Recurrence Score assay.
| Footnotes |
|---|
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Note: Presented in abstract form at the 2003 Annual Meeting of American Society of Clinical Oncology.
Received 4/4/05; revised 8/13/05; accepted 9/1/05.