Purpose: The requirement of frozen tissues for microarray experiments limits the clinical usage of genome-wide expression profiling by using microarray technology. The goal of this study is to test the feasibility of developing lung cancer prognosis gene signatures by using genome-wide expression profiling of formalin-fixed paraffin-embedded (FFPE) samples, which are widely available and provide a valuable rich source for studying the association of molecular changes in cancer and associated clinical outcomes.
Experimental Design: We randomly selected 100 Non–Small-Cell lung cancer (NSCLC) FFPE samples with annotated clinical information from the UT-Lung SPORE Tissue Bank. We microdissected tumor area from FFPE specimens and used Affymetrix U133 plus 2.0 arrays to attain gene expression data. After strict quality control and analysis procedures, a supervised principal component analysis was used to develop a robust prognosis signature for NSCLC. Three independent published microarray datasets were used to validate the prognosis model.
Results: This study showed that the robust gene signature derived from genome-wide expression profiling of FFPE samples is strongly associated with lung cancer clinical outcomes and can be used to refine the prognosis for stage I lung cancer patients, and the prognostic signature is independent of clinical variables. This signature was validated in several independent studies and was refined to a 59-gene lung cancer prognosis signature.
Conclusions: We conclude that genome-wide profiling of FFPE lung cancer samples can identify a set of genes whose expression level provides prognostic information across different platforms and studies, which will allow its application in clinical settings. Clin Cancer Res; 17(17); 5705–14. ©2011 AACR.
This article is the first study to develop a robust prognosis signature for non–small cell lung cancer (NSCLC) on the basis of genome-wide expression profiling of clinically available formalin-fixed and paraffin-embedded (FFPE) samples. Although clinical FFPE tumor samples are widely available, the genome-wide expression profiling of FFPE samples has been hampered because of the degradation of RNAs extracted from them. In this article, we show that NSCLC FFPE-derived signature is strongly associated with clinical outcome of the patients, is independent of clinical prognostic variables, and can be validated in several independent studies. We showed that, after strict quality control and analysis procedures, genome-wide profiling of FFPE samples can actually provide a unique opportunity to identify a set of genes whose expression level is less sensitive to the environmental changes. This gene signature is more robust across different platforms and studies, which is critical for the successful application of gene signatures in real clinical settings.
Lung cancer is the leading cause of death from cancer for both men and women in the United States and in most parts of the world, with a 5-year survival rate of 15% (1). Non–small-cell lung cancer (NSCLC) is the most common cause of lung cancer death, accounting for up to 85% of such deaths (2). Clinicopathologic staging is the standard prognosis factor for lung cancer used in clinical practice but does not capture the complexity of the disease so that heterogeneous clinical outcomes within the same stage are commonly seen. Several randomized clinical trials showed that adjuvant chemotherapy improves survival in resected NSCLC (3–7). The effect of adjuvant chemotherapy on prolonging survival is modest—only 4% to 15% improvement in 5-year survival—although such treatment is associated with serious adverse effects (6, 8). Therefore, it is of considerable clinical importance to have a robust and accurate prognostic signature for lung cancer, especially in early stage lung cancer to improve the current clinical decisions on whether an individual lung cancer patient should receive adjuvant chemotherapy or not.
Genome-wide expression profiles have been used to identify gene signatures to classify lung cancer patients with different survival outcomes (9–16). However, the requirement of frozen tissues for microarray experiments limits the clinical usage of these gene signatures. Furthermore, prognostic gene signatures for NSCLC developed by different groups show minimal overlap and are often difficult to reproduce by independent groups (17, 18). To address the requirement for frozen tissues, we designed this study to test the feasibility of developing lung cancer prognosis gene signatures by using genome-wide expression profiling of formalin-fixed paraffin-embedded (FFPE) samples, which are widely available and provide a valuable rich source for studying the association of molecular changes in cancer and associated clinical outcomes. We derived a prognosis signature for NSCLC from FFPE samples and validated it in several independent studies. To help other researchers reproduce all results in this study, we have provided a literate programming R package.
Materials and Methods
The overall study design and the flow chart of the derivation and validation of the robust gene signature are described in Figure 1. We randomly selected 100 NSCLC FFPE samples with annotated clinical information from the UT-Lung SPORE Tissue Bank from 2001 to 2005. From these samples, 75 samples passed the mRNA quality control criteria (Supplementary Methods). Among these 75 samples, 48 samples are adenocarcinomas and 27 are squamous cell carcinomas. The median follow-up time is 2.8 years and the maximum follow-up time is 6.9 years; the characteristics of these patients are summarized in Supplementary Table S1. The samples were obtained under approval of the Institutional Review Boards at MD Anderson Cancer Center.
Sample microdissection and RNA extraction
FFPE tumor specimens were cut into serial sections with a thickness of 10 μm. For the pathologic diagnosis, one slide was stained with H&E and evaluated by a pathologist. Other sections were stained with nuclear fast red (NFR; American MasterTech Scientific Inc.) to enable visualization of histology. Tumor tissue was isolated by using manual macrodissection when the tumor area was more than 0.5 × 0.5 mm² or laser capture microdissection (P.A.L.M. Microlaser Technologies AG) in cases of smaller tumor areas. At least 50 mm² of tumor tissue was collected from each FFPE block. The extraction of RNA from tissue samples was done by a proprietary procedure of Response Genetics, Inc. (United States Patent Application 20090092979) designed to optimize the yield of higher molecular weight RNA fragments from FFPE specimens.
Microarray data preprocessing and quality control
Total RNA was processed for analysis on the Affymetrix U133 plus 2.0 arrays according to Affymetrix protocols for first- and second-strand synthesis, biotin labeling, and fragmentation. The quality control procedure for microarray data analysis was based on the percentage of present calls calculated by the MAS5 package. We selected arrays with at least 15% of probe sets present; 55 of 75 arrays passed this quality control criterion and were used for the analysis. We then selected probe sets that were present on all 55 arrays; 1,400 genes passed this criterion. These 1,400 genes are referred to as the robust gene set (RGS), because the mRNA expression of these genes is robust to FFPE processing. The 55 samples and the 1,400 genes were used to develop gene signatures.
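The two-step filter described above (keep arrays with at least 15% present calls, then keep probe sets present on every retained array) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `filter_arrays_and_probes`, the dictionary layout, and the 'P'/'M'/'A' call encoding are assumptions made for the example.

```python
from typing import Dict, List, Tuple


def filter_arrays_and_probes(
    calls: Dict[str, Dict[str, str]],
    min_present_fraction: float = 0.15,
) -> Tuple[List[str], List[str]]:
    """Apply the array-level and probe-level present-call filters.

    calls[array_id][probe_id] is assumed to be 'P' (present),
    'M' (marginal) or 'A' (absent); this encoding is hypothetical.
    """
    # Step 1: keep arrays whose fraction of present calls meets the cutoff.
    kept_arrays = []
    for array_id, probe_calls in calls.items():
        present = sum(1 for call in probe_calls.values() if call == "P")
        if probe_calls and present / len(probe_calls) >= min_present_fraction:
            kept_arrays.append(array_id)
    if not kept_arrays:
        return [], []

    # Step 2: keep probe sets called present on every retained array.
    candidate_probes = list(calls[kept_arrays[0]].keys())
    robust_probes = [
        probe
        for probe in candidate_probes
        if all(calls[array_id].get(probe) == "P" for array_id in kept_arrays)
    ]
    return kept_arrays, robust_probes
```

With the study's numbers, the first returned list would correspond to the 55 retained arrays and the second to the robust gene set.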
After microarray analysis QC, we used the RMA background correction algorithm (19) to remove nonspecific background noise. A robust regression model (20) was fitted to the probe level data, and the fitted expression values for the probes at the 3′ end were used to summarize the probe set expression values. Quantile–quantile normalization was used to normalize all the arrays. Consortium microarray raw data (13) was downloaded from caArray database of the National Cancer Institute (NCI) and preprocessed by RMA background correction and quantile–quantile normalization. All gene expression values were log transformed (on a base 2 scale).
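As an illustration of the normalization and log transformation steps, the sketch below implements plain quantile normalization across arrays followed by a base 2 log transform. It omits RMA background correction and the robust probe-level model, and it breaks ties by sort order rather than averaging; the function names are hypothetical.

```python
import math
from typing import List


def quantile_normalize(arrays: List[List[float]]) -> List[List[float]]:
    """Force every array to share the same empirical distribution.

    arrays[i][j] is the intensity of probe j on array i.
    """
    n_probes = len(arrays[0])
    if any(len(array) != n_probes for array in arrays):
        raise ValueError("all arrays must contain the same number of probes")

    # Reference distribution: mean of the k-th smallest value across arrays.
    sorted_arrays = [sorted(array) for array in arrays]
    reference = [
        sum(sorted_array[k] for sorted_array in sorted_arrays) / len(arrays)
        for k in range(n_probes)
    ]

    normalized = []
    for array in arrays:
        # Rank each probe within its array, then substitute the reference value.
        order = sorted(range(n_probes), key=lambda j: array[j])
        out = [0.0] * n_probes
        for rank, probe_index in enumerate(order):
            out[probe_index] = reference[rank]
        normalized.append(out)
    return normalized


def log2_transform(arrays: List[List[float]]) -> List[List[float]]:
    """Log transform positive intensities on a base 2 scale."""
    return [[math.log2(value) for value in array] for array in arrays]
```

After normalization every array has the same sorted values, so differences between arrays reflect only the ordering of probes within each array.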
Supervised classification by using supervised principal component analysis
Classification was done by using supervised principal component analysis (21, 22), a widely used classification method in biomedical research (23–26). As with any supervised classification method, each prediction model was trained in a training dataset and its performance was then tested in an independent test dataset. We used the R package Superpc (version 1.05), run in R (version 2.81), to implement the prediction algorithm with the default parameters. The implementation details can be found in the Supplementary Sweave Report. The training and testing sets for each prediction model are summarized in Supplementary Table S2.
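The general idea of supervised principal component analysis for survival data can be sketched in three steps: score each gene by its univariate association with survival, keep the genes whose score exceeds a threshold, and use the first principal component of the retained genes as a risk score. The code below is a simplified illustration under stated assumptions, not the Superpc implementation: the threshold is passed in by hand rather than chosen by cross-validation, ties in survival time are handled naively, the first component is found by power iteration, and all function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def cox_score(x: Sequence[float], times: Sequence[float], events: Sequence[int]) -> float:
    """Univariate Cox score statistic at beta = 0 (positive = higher hazard)."""
    u = v = 0.0
    for i, (t_i, e_i) in enumerate(zip(times, events)):
        if not e_i:
            continue
        risk = [x[k] for k in range(len(x)) if times[k] >= t_i]  # risk set at t_i
        mean = sum(risk) / len(risk)
        u += x[i] - mean
        v += sum((r - mean) ** 2 for r in risk) / len(risk)
    return u / math.sqrt(v) if v > 0 else 0.0


def first_principal_component(rows: List[List[float]], n_iter: int = 200) -> List[float]:
    """Leading right singular vector of a centered samples-by-genes matrix."""
    n_genes = len(rows[0])
    v = [float(j + 1) for j in range(n_genes)]  # arbitrary non-uniform start
    for _ in range(n_iter):
        proj = [sum(r[j] * v[j] for j in range(n_genes)) for r in rows]
        w = [sum(rows[i][j] * proj[i] for i in range(len(rows))) for j in range(n_genes)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0:
            break
        v = [c / norm for c in w]
    return v


def fit_supervised_pc(expr: List[List[float]], times, events, threshold: float) -> Dict:
    """expr is genes x samples; returns selected genes, their means and loadings."""
    scores = [cox_score(gene, times, events) for gene in expr]
    genes = [j for j, s in enumerate(scores) if abs(s) >= threshold]
    if not genes:
        raise ValueError("no gene passed the threshold")
    means = [sum(expr[j]) / len(expr[j]) for j in genes]
    rows = [[expr[j][i] - m for j, m in zip(genes, means)] for i in range(len(times))]
    loadings = first_principal_component(rows)
    train = [sum(a * b for a, b in zip(r, loadings)) for r in rows]
    if cox_score(train, times, events) < 0:  # orient: larger score = higher risk
        loadings = [-l for l in loadings]
    return {"genes": genes, "means": means, "loadings": loadings}


def risk_scores(model: Dict, expr: List[List[float]]) -> List[float]:
    """Project new samples (genes x samples) onto the fitted component."""
    n_samples = len(expr[0])
    return [
        sum((expr[j][i] - m) * l
            for j, m, l in zip(model["genes"], model["means"], model["loadings"]))
        for i in range(n_samples)
    ]
```

A model fitted on a training set can then be applied to an independent test set with `risk_scores`, and test samples can be split into predicted high- and low-risk groups, for example at the median score.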
Overall survival time was calculated from the date of surgery until death or the last follow-up contact. Survival curves were estimated by using the product-limit method of Kaplan–Meier (27) and were compared by using the log-rank test. The maximum follow-up time for the FFPE patient cohort is less than 7 years, whereas some patients in the consortium cohort have been followed for up to 17 years. To avoid extrapolating the prediction model, survival times were truncated at 7 years when comparing predicted groups. The analysis results without truncation can be seen in the Supplementary Sweave Report. Univariate and multivariate Cox proportional hazards analyses (28) were also done, with survival as the dependent variable.
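The survival analysis steps above can be illustrated with a small pure-Python sketch: administrative censoring at a fixed horizon (7 years in this study), the Kaplan–Meier product-limit estimate, and a two-group log-rank test. The function names are hypothetical and the code is meant only to show the calculations, not to reproduce the reported results.

```python
import math
from typing import List, Sequence, Tuple


def truncate_follow_up(times: Sequence[float], events: Sequence[int], horizon: float):
    """Censor every patient still under observation beyond the horizon."""
    out_t, out_e = [], []
    for t, e in zip(times, events):
        if t > horizon:
            out_t.append(horizon)
            out_e.append(0)
        else:
            out_t.append(t)
            out_e.append(e)
    return out_t, out_e


def kaplan_meier(times: Sequence[float], events: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time."""
    data = sorted(zip(times, events))
    at_risk, surv, curve, i = len(data), 1.0, [], 0
    while i < len(data):
        t, deaths, removed = data[i][0], 0, 0
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= removed
    return curve


def log_rank(times_a, events_a, times_b, events_b) -> Tuple[float, float]:
    """Two-group log-rank test; returns (chi-square with 1 df, P value)."""
    pooled = [(t, e, 0) for t, e in zip(times_a, events_a)]
    pooled += [(t, e, 1) for t, e in zip(times_b, events_b)]
    observed_a = expected_a = variance = 0.0
    for t in sorted({s for s, e, _ in pooled if e}):
        n = sum(1 for s, _, _ in pooled if s >= t)
        n_a = sum(1 for s, _, g in pooled if s >= t and g == 0)
        d = sum(1 for s, e, _ in pooled if s == t and e)
        observed_a += sum(1 for s, e, g in pooled if s == t and e and g == 0)
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi_square = (observed_a - expected_a) ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return chi_square, math.erfc(math.sqrt(chi_square / 2))
```

Truncation is applied to each group before calling `kaplan_meier` or `log_rank`, so that patients followed beyond the horizon contribute as censored observations at the horizon.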
The robust gene set defines two tumor groups
The expression of these 1,400 genes divided the 55 patients into 2 groups on the basis of unsupervised clustering analysis (with Euclidean distance and complete linkage for the hierarchical clustering algorithm; Fig. 2). Interestingly, group 1 has significantly shorter survival time compared with group 2 (Fig. 2B; HR = 3.6, P = 0.017), and multivariate Cox proportional hazards analysis showed that the association between RGS groups and survival (P = 0.012) is independent of stage. Notably, group 1 was dominated by squamous cell carcinoma (23/28), whereas group 2 was dominated by adenocarcinomas (25/27; P < 0.0001; Supplementary Table S3). The other clinical characteristics including gender, age, and smoking status were not significantly different between the 2 groups. To explore whether the association between RGS groups and survival is due to the histologic difference between two groups, we drew Kaplan–Meier curves by both histology and RGS groups (Supplementary Fig. S1), and it shows clearly that RGS can distinguish high- and low-risk groups within both adenocarcinoma and squamous groups, indicating the association of RGS groups and survival is independent of histology groups.
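The unsupervised clustering step (Euclidean distance, complete linkage, dendrogram cut into two groups) can be sketched as below. This is a naive agglomerative implementation intended for illustration on small inputs; the function names are hypothetical, and a production analysis would normally rely on an established clustering library.

```python
import math
from typing import List, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def complete_linkage_clusters(samples: List[Sequence[float]], n_clusters: int = 2) -> List[List[int]]:
    """Agglomerative clustering with complete linkage.

    samples[i] is the expression profile of sample i; the result lists
    the sample indices belonging to each cluster.
    """
    clusters = [[i] for i in range(len(samples))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Complete linkage: distance between the two farthest members.
                d = max(
                    euclidean(samples[a], samples[b])
                    for a in clusters[i]
                    for b in clusters[j]
                )
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(cluster) for cluster in clusters]
```

The two resulting index lists play the role of RGS group 1 and group 2, which can then be compared for survival and clinical characteristics.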
We used gene set enrichment analysis to identify the enriched gene sets in both RGS groups. Interestingly, an estrogen receptor (ER)–negative signature in breast cancer (29) is enriched in RGS group 1, whereas an ER-positive signature in breast cancer (29) is enriched in RGS group 2 (Fig. 2C and D), indicating a relationship between the ER signatures and the RGS groups. The other enriched gene sets are summarized in Supplementary Table S4; notably, genes enriched in group 1 are also enriched in mouse neural stem cells and embryonic stem cells.
Construct and validate RGS prognosis signatures
FFPE samples training to testing.
The strong associations between RGS groups and survival outcomes motivated us to explore whether the RGS expression profile can be used to construct a prognosis signature. We randomly divided the 55 patients into training (25 samples) and testing (30 samples) sets and constructed a prediction model by using the 1,400 robust gene expression values in the training set through a supervised principal component approach (21). Figure 3A shows that the predicted low-risk group has significantly longer survival time than the predicted high-risk group (P = 0.013) in the testing set. To test whether this association could have arisen by chance, we randomly split the data into training and testing sets 200 times, repeated the same prediction and testing procedures for each split, and found that the prognosis performance of the RGS signature is significantly better than random (P = 0.02).
Frozen samples training to testing.
We then tested whether this robust gene set can be used to construct a prognosis signature in frozen samples. The largest independent publicly available lung cancer microarray dataset is the recently published NCI Director's Consortium for study of lung cancer involving 442 resected adenocarcinomas (13). From that study, Affymetrix U133A microarray data were extracted for 1,012 of the robust genes, 388 fewer genes than in our FFPE data because of the difference in microarray platforms. We used the same training and testing strategy as in the original analyses of these data (13) for constructing and validating the prognosis signature through the supervised principal component approach. The training set included samples from University of Michigan Cancer Center (UM) and Moffitt Cancer Center (HLM), and the testing set included the Memorial Sloan-Kettering Cancer Center (MSKCC) and Dana-Farber Cancer Institute (DFCI) samples. This analysis revealed that the predicted low-risk group has significantly longer survival time than the predicted high-risk group (HR = 2.44, P = 0.00014) in the testing dataset (Fig. 3B).
FFPE to frozen samples and vice versa.
Next, we used our FFPE dataset and the consortium dataset (frozen samples) to investigate whether the prediction model built from one type of sample can be validated in the other type of sample. Again, the same supervised principal component method was used to construct the prediction model. The prediction model built from FFPE samples can significantly distinguish the high- and low-risk groups in frozen samples (Fig. 3C; HR = 1.95, P = 5.4 × 10⁻⁷), and the prediction model built from frozen samples can also distinguish the high- and low-risk groups in FFPE samples but with marginal significance (Fig. 3D; HR = 3.59, P = 0.068). We also tested the performance of the FFPE prediction model on the 4 individual datasets in the consortium study and found that the predicted low-risk groups have longer survival time compared with the predicted high-risk groups for all sets: MSKCC dataset (median survival time 6.5 vs. 3.3 years; HR = 2.31, P = 0.0093), DFCI dataset (median survival time 5.9 vs. 0.9 years; HR = 2.62, P = 0.0076), HLM dataset (median survival time 3.4 vs. 2.2 years; HR = 1.25, P = 0.4) and UM dataset (median survival time 5.4 vs. 2.2 years; HR = 1.98, P = 0.0011; Supplementary Fig. S2). Next, we compared the performance of the RGS signature with previously published lung cancer prognosis signatures by using the same consortium dataset as the testing set. Shedden and colleagues (13) showed that the HRs for the Method A signature (the best signature in their study) and the Chen and colleagues (11) signatures range from 1.10 to 1.83 for the MSKCC test set, whereas the HR for our RGS signature is 2.89 on the same MSKCC test set. For the DFCI test set, the HRs range from 1.76 to 2.30 by using the published signatures, whereas the HR for our RGS signature on the same DFCI test set is 2.39. Therefore, the prognosis performance of the RGS signature is at least as good as that of other published signatures in the microarray dataset.
The RGS prognosis signature is independent of clinical variables
To test whether RGS is an independent prognosis signature, we fitted a multivariate Cox regression model including RGS risk scores, age, gender, stage, smoking status, adjuvant chemotherapy usage, and clinical sites as covariables for the consortium dataset. The RGS risk scores were calculated from the prediction model built from the FFPE sample set. Table 1 shows that the RGS signature is significantly associated with the survival time after adjusting for other clinical variables (HR = 1.3, P = 0.007). Pathologic stage based on the international staging system is the most widely used and important prognosis variable for lung cancer patients (30); here we tested whether the RGS signature can further refine the prognosis within each stage. The RGS prognosis signature from FFPE samples was tested within each stage of the consortium dataset. The results show clearly that the RGS signature is significantly associated with survival outcome within each stage (Fig. 3E–G; HR = 1.54, P = 0.036 for stage I, HR = 1.81, P = 0.022 for stage II and HR = 1.90, P = 0.021 for stage III), indicating that the RGS signature can refine the prognosis for lung cancer patients. The RGS prognosis signature from FFPE samples was further tested for patients with or without adjuvant chemotherapy separately, and the results show that the RGS signature is significantly associated with survival for both groups (Supplementary Fig. S3A and B; HR = 1.95, P = 0.015 for patients with chemotherapy, HR = 1.99, P = 0.00062 for patients without chemotherapy).
Refine to 59-gene prognosis signature
Among all the RGS genes, 131 genes are associated with survival (P < 0.05) in the FFPE dataset, and 365 genes are associated with overall survival (P < 0.05) in the consortium dataset by univariate Cox regression analysis. There is significant overlap between these two gene lists (Fig. 4A; 59 common genes; P = 0.0008, hypergeometric test). More significant genes were found in the consortium data compared with the FFPE data, which is likely due to the larger sample size (n = 442) of the consortium dataset compared with the FFPE dataset sample size (n = 55). Surprisingly, HRs from the two datasets are very consistent with each other. All 59 genes have the same direction of effects (positive or negative) on the survival between the 2 datasets and the HRs from 2 datasets are highly correlated (Pearson's correlation = 0.86; Fig. 4B), indicating the high consistency of expressions of these genes across datasets. These results motivated us to hypothesize that these 59 genes (Supplementary Table S5) alone can be used for lung cancer prognosis. To test this hypothesis, we applied supervised principal component analysis to these 59 genes by using the FFPE dataset to construct a 59-gene prognosis signature. Because the selection of these 59 genes used information from both FFPE and consortium datasets, we used another 2 independent lung cancer datasets, including the Bild and colleagues (n = 111; ref. 9) dataset and the Bhattacharjee and colleagues dataset (n = 117; ref. 31) downloaded from the literature to validate our 59-gene signature. The 59-gene prediction model built from FFPE samples can significantly distinguish the high- and low-risk groups for both the Bhattacharjee and colleagues and Bild and colleagues datasets (Fig. 5A; HR = 1.81, P = 0.016 and Fig. 5C; HR = 2.10, P = 0.02, respectively). Furthermore, this signature can also significantly distinguish the high- and low-risk groups within stage I patients for both datasets (Fig. 5B and D), indicating that this 59-gene signature can refine the prognosis for lung cancer patients within stage I patients. Because of the small sample size for stage II and stage III patients in Bild and colleagues and Bhattacharjee and colleagues studies, the 59-gene prognosis signature was not tested for stage II and stage III patients. We also found that 59-gene prediction model built from the consortium dataset can also distinguish the high- and low-risk groups for the Bild and colleagues and Bhattacharjee and colleagues datasets (Supplementary Fig. S4A–D).
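The hypergeometric test used above for the overlap between the two survival-associated gene lists can be written as an upper-tail probability. The sketch below is illustrative only: the function name is hypothetical, and the size of the gene universe is an assumption the caller must supply, because the text does not state which background (for example, the 1,400 FFPE robust genes or the 1,012 genes shared with the U133A platform) was used.

```python
from math import comb


def hypergeom_upper_tail(universe: int, list_a: int, list_b: int, overlap: int) -> float:
    """P(X >= overlap) for the overlap between two gene lists.

    universe: number of genes both lists were drawn from (an assumption
              supplied by the caller)
    list_a:   size of the first list
    list_b:   size of the second list
    overlap:  observed number of genes in common
    """
    total = comb(universe, list_b)
    upper = min(list_a, list_b)
    # Sum the hypergeometric probability mass from the observed overlap upward.
    favourable = sum(
        comb(list_a, k) * comb(universe - list_a, list_b - k)
        for k in range(overlap, upper + 1)
    )
    return favourable / total
```

For example, `hypergeom_upper_tail(universe, 131, 365, 59)` gives the probability of seeing at least 59 shared genes by chance for a chosen universe size; the result depends strongly on that choice.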
To understand the potential biological relevance of these 59 genes significantly associated with survival in the FFPE and consortium datasets, we used Ingenuity Pathway Analysis (IPA) to explore which known regulatory networks are enriched in this 59-gene set. IPA analysis revealed the most significant molecular networks to be cancer, tumor morphology, and respiratory disease. This network (Fig. 4C) includes 14 genes of the 59-gene set and is centered on transcription factors HNF4A, HNF1A, and ONECUT1 (HNF6A). This hepatocellular network has been implicated in hepatocellular carcinoma as determined by in vitro study (32) and molecular interactions in this network are putatively involved in lung cancer survival.
In this study, we tested the feasibility of deriving a lung cancer prognosis gene signature from FFPE tumor samples on the basis of genome-wide mRNA expression profiling. Although reverse transcriptase PCR methods have been used to measure gene expression level from FFPE samples (33–35), the selection of genes for testing is limited to the current knowledge base, which is incomplete and inconsistent (36). Because of degradation and chemical alteration of RNA extracted from FFPE samples, the use of microarray analysis of gene expression from FFPE samples has been hampered (36). New technology and methodologies developed to extract RNA from FFPE samples coupled with new array platforms have made it possible to measure gene expression from FFPE samples (33, 37–40). A recent study showed the feasibility of using DNA-mediated annealing, selection, extension, and ligation arrays with 6,100 preselected genes to profile mRNA expression from hepatocellular carcinoma tissue (41). No prognosis signature for other types of cancer has been developed by using microarray analysis of gene expression from FFPE extracted RNA. In this study, we built a robust gene signature for NSCLC on the basis of microarray analysis of FFPE samples. We claim that this is a robust gene signature because it has been validated in 6 independent published datasets, including 4 sets from the consortium study and 2 additional studies from DFCI and Duke. We also built a prediction model by using the same set of robust genes from frozen samples and validated the model in both frozen and FFPE samples.
Most published gene signatures identified from different studies are very different, with little overlap. However, we found that there is significant overlap among the robust genes associated with survival outcomes between the FFPE dataset and the consortium dataset (P = 0.008). More impressively, the HRs, indicating the strength of the association of gene expression and survival time, are highly consistent between 2 independent datasets. Our interpretation for this consistency across studies is that the gene expression variation across studies is a major contribution to signature differences across studies. In this study, we used strict quality steps to exclude genes that were not expressed in our FFPE samples. This allowed for analysis of the remaining genes, which had more stable expression patterns and were more robust to environmental changes. Validation of our novel 59-gene signature prognostic for NSCLC survival in 2 additional independent datasets further confirmed the robustness of these genes.
By grouping our RGS of 1,400 genes by gene expression, we found that the group expression levels correlated with survival. Interestingly, group 1 had a shorter survival and contained an ER-negative breast cancer signature. Group 2 had a longer survival and contained an ER-positive breast cancer signature. This correlation with ER status and survival has been shown previously in breast cancer and shown to have predictive power for prognosis (29). In addition to ER status, the RGS groups were separated by the presence of stem cell signatures (embryonic stem cell signature and neural stem cell signature), with group 1 (shorter survival) having 2 stem cell signatures, whereas group 2 (longer survival) did not. The embryonic stem cell signature has previously been shown to be associated with poor prognosis of NSCLC (42). In addition, in mouse models, a hematopoietic and neural stem cell–like signature in primary tumors has been shown to be a predictor of poor prognosis in 11 types of cancer, including lung (43). These ER status and stem cell signature data support our RGS expression groupings and their correlation with survival prognosis.
Besides the prognostic signature, the predictive signatures to determine the optimal chemotherapy regimen for individual patients also have tremendous clinical benefit. Tumor samples from clinical trials data are important to develop predictive signatures to reduce the selection bias for evaluating treatment efficacy within signature groups. However, very limited frozen tumor samples are available from completed clinical trials. Our study showed the feasibility of using FFPE samples for genome-wide mRNA profiling. Therefore, this study provides an important step to construct and validate predictive signatures for chemotherapy response by using the available FFPE samples from clinical trials in the future.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
This study was supported in part by grants from the Department of Defense (W81XWH-07-1-0306 03 to J.D. Minna and I.I. Wistuba), the Specialized Program of Research Excellence in Lung Cancer Grant (P50CA70907 to J.D. Minna, J. Roth, and I.I. Wistuba), the NCI (1R01CA152301-01 to Y. Xie and I.I. Wistuba, Cancer Center Support Grant CA-16672), the NIH (5R21DA027592 to G. Xiao), and the NSF (DMS0907562 to G. Xiao).
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Note: Supplementary data for this article are available at Clinical Cancer Research Online (http://clincancerres.aacrjournals.org/).
- Received January 26, 2011.
- Revision received June 10, 2011.
- Accepted June 29, 2011.
- ©2011 American Association for Cancer Research.