Purpose: Blood-based surrogate markers would be attractive biomarkers for early detection, diagnosis, prognosis, and prediction of therapeutic outcome in cancer. Disease-associated gene expression signatures in peripheral blood mononuclear cells (PBMC) have been described for several cancer types. However, RNA-stabilized whole blood–based technologies would be clinically more applicable and robust. We evaluated the applicability of whole blood–based gene expression profiling for the detection of non–small cell lung cancer (NSCLC).
Experimental Design: Expression profiles were generated from PAXgene-stabilized blood samples from three independent groups consisting of NSCLC cases and controls (n = 77, 54, and 102), using the Illumina WG6-VS2 system.
Results: Several genes are consistently differentially expressed in whole blood between NSCLC patients and controls. These expression profiles were used to build a diagnostic classifier for NSCLC, which was validated in an independent validation set of NSCLC patients (stages I–IV) and hospital-based controls. The area under the receiver operating characteristic curve (AUC) was calculated to be 0.824 (P < 0.001). In a further independent dataset of stage I NSCLC patients and healthy controls the AUC was 0.977 (P < 0.001). Specificity of the classifier was validated by permutation analysis in both validation cohorts. Genes within the classifier are enriched in immune-associated genes and show specificity for NSCLC.
Conclusions: Our results show that gene expression profiles of whole blood allow for detection of manifest NSCLC. These results prompt further development of gene expression–based biomarker tests in peripheral blood for the diagnosis and early detection of NSCLC. Clin Cancer Res; 17(10); 3360–7. ©2011 AACR.
Our results show that gene expression profiles of whole blood samples can be used to detect non–small cell lung cancer in smokers. These results open the avenue for further development of gene expression–based biomarker tests in peripheral blood for the diagnosis and early detection of non–small cell lung cancer. In the future such a biomarker could be developed as a diagnostic tool supplementary to imaging.
Lung cancer is still the leading cause of cancer-related death worldwide. Prognosis has remained poor with a disastrous 2-year survival rate of approximately 15% due to diagnosis of the disease in late, that is, incurable stages in the majority of patients (1) and still disappointing therapeutic regimens in advanced disease (2). Thus, there is an urgent need to establish reliable tools for the identification of non–small cell lung cancer (NSCLC) patients at early stages of the disease, for example, prior to the development of clinical symptoms. Today, the only way to detect NSCLC is by means of imaging technologies detecting morphologic changes in the lung in combination with biopsy specimens taken for histologic examination. However, these screening approaches are not easily applied to secondary prevention of NSCLC in an asymptomatic population (3).
The use of surrogate tissue–based, for example, blood-based, biomarkers for NSCLC might therefore circumvent the known pitfalls of imaging technologies and invasive diagnostics (3, 4). Such biomarkers might be utilized to direct imaging-based and invasive screening approaches to only those individuals identified as potential NSCLC patients by biomarker screening.
Array-based assessment of disease-specific gene expression patterns in peripheral blood mononuclear cells (PBMC) has been reported for nonmalignant (5) and malignant diseases including renal cell carcinoma, melanoma, bladder, breast, and lung cancers (6–11). In some cases, gene expression profiles derived from PBMC were even suggested as promising tools for early detection (8, 11) or prediction of prognosis (6), although these findings have not yet been validated in independent studies. Furthermore, circumventing known pitfalls of analyzing PBMC in a clinical setting (12, 13) by using stabilized RNA derived from whole blood would further strengthen the validity of blood-based surrogate biomarkers for early diagnosis of lung cancer and other malignant diseases.
Using 3 independent datasets of patients and controls, we investigated the validity of whole blood–based gene expression profiling for the detection of NSCLC patients among smokers. We show that RNA-stabilized whole blood samples can indeed be used to identify NSCLC patients among hospital-based controls as well as healthy individuals.
Materials and Methods
Cases and controls
NSCLC cases and hospital-based controls were recruited at the University Hospital Cologne and the Lung Clinic Merheim, Cologne, Germany. Healthy blood donors were recruited at the Institute for Transfusion Medicine, University of Cologne. From all individuals PAXgene-stabilized blood samples were taken for blood-based gene expression profiling. For all NSCLC cases blood was taken before chemotherapy. To establish and validate an NSCLC-specific classifier, 3 independent sets of cases and controls were assembled. The training set (TS) comprised 77 individuals; 35 of those were NSCLC cases of stages I to IV admitted to the hospital with symptoms of NSCLC (coughing, dyspnea, weight loss, or reduction in general health state) and 42 were hospital-based controls with a comparable comorbidity but no prior history of lung cancer. The validation set 1 (VS1, n = 54) likewise contained 28 NSCLC cases of stages I to IV and 26 hospital-based controls. Overall, the hospital-based controls in TS and VS1 included individuals suffering from advanced chronic obstructive pulmonary disease (COPD) as typically seen in a population of heavily smoking adults (TS, n = 7; VS1, n = 5). Other diseases such as hypertension (TS, n = 17; VS1, n = 11) or other malignancies (TS, n = 10; VS1, n = 6) were also observed in the group of hospital-based controls. The validation set 2 (VS2, n = 102) contained 32 NSCLC cases that had documented stage I NSCLC and were diagnosed mostly during routine chest X-ray analyses or during clinical workup of nonspecific symptoms such as reduced general health status. All of these individuals had an Eastern Cooperative Oncology Group performance status of 0. In addition, VS2 contained 70 healthy blood donors without prior history of lung cancer. Detailed information on cases and controls is summarized in Table 1 and in Supplementary Table S1. The analyses were approved by the local ethics committee and all participants gave informed consent.
Blood collection, cRNA synthesis, and array hybridization
Blood (2.5 mL) was drawn into PAXgene vials. After RNA isolation, biotin-labeled cRNA preparation was carried out by using the Ambion Illumina RNA amplification kit (Ambion) or Epicentre TargetAmp Kit (Epicentre Biotechnologies) and Biotin-16-UTP (10 mmol/L; Roche Molecular Biochemicals) or Illumina TotalPrep RNA Amplification Kit (Ambion). Biotin-labeled cRNA (1.5 μg) was hybridized to Sentrix whole genome bead chips WG6 version 2 (Illumina) and scanned on the Illumina BeadStation 500x. For data collection, we used Illumina BeadStudio software. Data are available at http://www.ncbi.nlm.nih.gov/geo/ (GSE12771).
For RNA quality control, the ratio of the optical density (OD) at wavelengths of 260 and 280 nm was calculated for all samples and was between 1.85 and 2.1. To determine the quality of cRNA, a semiquantitative reverse transcriptase PCR amplifying a 5′ and a 3′ product of the β-actin gene was used as previously described (14); both the 5′ and the 3′ products were present, showing no sign of degradation. All expression data presented in this article were of high quality, as assessed with 3 separate tools. First, we performed quality control by visual inspection of the distribution of raw expression values. To this end, we constructed pairwise scatter plots of expression values from all arrays (R-project version 2.8.0; ref. 15). For data derived from arrays of good quality, a high correlation of expression values is expected to produce a cloud of dots along the diagonal. In all comparisons the r² was more than 0.95. Second, the present call rate was high in all samples. Finally, we conducted quantitative quality control. Here, the absolute deviation of the mean expression value of each array from the overall mean was determined (R-project version 2.8.0; ref. 15). In short, the mean expression value for each array was calculated. Next, the mean of these mean expression values (overall mean) was taken and the deviation of each array mean from the overall mean was determined (analogous to probe outlier detection used by Affymetrix before expression value calculation; ref. 16). The deviation was less than 28 for all samples.
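The quantitative quality control step can be illustrated with a minimal Python sketch. It is not the R code used in the study; the function name, the sample names, and the toy values are hypothetical, and only the cutoff of 28 is taken from the text.

```python
from statistics import mean

def array_mean_deviations(arrays):
    """Return |mean(array) - overall mean| for each array.

    `arrays` maps a sample name to its list of expression values.
    """
    array_means = {name: mean(values) for name, values in arrays.items()}
    # The overall mean is the mean of the per-array means, as described above.
    overall_mean = mean(list(array_means.values()))
    return {name: abs(m - overall_mean) for name, m in array_means.items()}

# Hypothetical toy data; real arrays hold tens of thousands of probes.
toy_arrays = {
    "sample_a": [100.0, 200.0, 300.0],
    "sample_b": [110.0, 210.0, 310.0],
    "sample_c": [90.0, 190.0, 290.0],
}
deviations = array_mean_deviations(toy_arrays)

MAX_DEVIATION = 28  # all samples in the study stayed below this value
flagged = [name for name, d in deviations.items() if d >= MAX_DEVIATION]
```

In this toy example no array is flagged, which mirrors the situation reported for the study samples.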
An overview of the experimental design is depicted in Figure 1. Expression values were independently quantile normalized. The classifier for NSCLC was built and optimized on the basis of the TS (n = 77; 35 NSCLC cases stages I–IV and 42 hospital-based controls) by using a 10-fold cross-validation design. Briefly, the TS was divided 10 times into an internal training set and an internal validation set in a ratio of 9:1 (distribution to internal validation group, see Supplementary Table S1). Each of the 10 split datasets was used once as internal validation set. In the internal TS, the differentially expressed genes between NSCLC cases and controls were calculated by a t test. Next, 36 different feature lists were extracted from this list of differentially expressed genes by sequentially increasing the cutoff of the P value 36 times (P = 0.00001, P = 0.00002, P = 0.00003, …, P = 0.08, P = 0.09, P = 0.1). Subsequently, for each of the resulting 36 feature lists, 3 different learning algorithms [support vector machine (SVM), linear discriminant analysis (LDA), and prediction analysis for microarrays (PAM)] were trained on the internal TS and used to calculate the probability score for each case of the respective internal validation set. This approach was repeated 10 times according to the 10 dataset splits of the 10-fold cross-validation. For each of the 10 cross-validation steps the area under the receiver operating characteristic curve (AUC) was calculated for the internal validation set. For each of the 36 cutoffs the mean of the 10 AUCs was calculated. The optimal cutoff P value of the t statistics and the optimal classification algorithm were selected according to the highest mean AUC reached by any of the 3 algorithms (Fig. 2). We subsequently built a classifier by using the respective cutoff P value of the t statistics and the selected algorithm in the TS.
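The selection step can be sketched as follows. This minimal Python example is an illustration under stated assumptions and not the original analysis code: it computes the AUC from its rank-based definition and picks the combination of algorithm and P value cutoff with the highest mean AUC across folds. The function names and the per-fold AUC values are hypothetical, and the training of SVM, LDA, and PAM itself is not shown.

```python
from statistics import mean

def auc(case_scores, control_scores):
    """AUC as the probability that a random case outscores a random control.

    Ties count one half; this equals the normalized Mann-Whitney U statistic.
    """
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

def best_configuration(fold_aucs):
    """Pick the (algorithm, p_cutoff) pair with the highest mean AUC.

    `fold_aucs` maps (algorithm, p_cutoff) to the list of per-fold AUCs.
    """
    mean_aucs = {cfg: mean(values) for cfg, values in fold_aucs.items()}
    best = max(mean_aucs, key=mean_aucs.get)
    return best, mean_aucs[best]

# Hypothetical per-fold AUCs for two configurations (not the study's values).
toy_fold_aucs = {
    ("SVM", 0.003): [0.8, 0.7, 0.9],
    ("LDA", 0.003): [0.6, 0.7, 0.8],
}
best_cfg, best_mean = best_configuration(toy_fold_aucs)
toy_auc = auc([0.9, 0.4], [0.1, 0.4])
```

In the study the same comparison runs over 10 folds, 36 cutoffs, and 3 algorithms, so `fold_aucs` would hold 108 entries of 10 values each.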
To further control for overfitting (17), the classifier was validated in 2 independent validation sets [VS1, comprising 28 NSCLC cases (stages I–IV) and 26 hospital-based controls; VS2, comprising 32 NSCLC cases (stage I) and 70 healthy controls]. The AUC was used to measure the quality of the classifier. In addition, we determined a threshold of the test score in the TS to evaluate sensitivity and specificity in the validation sets. In order not to miss a potential case of NSCLC, we maximized the sensitivity to detect NSCLC while requiring a minimum specificity (18): the 95% CI of the specificity had to include 0.5. Of note, the threshold fulfilling these criteria was determined in the TS. Subsequently, all individuals in VS1 and VS2 reaching an equal or higher test score than the TS-based threshold score were classified as NSCLC cases and all others as controls. The sensitivity and specificity of this diagnostic test and their 95% CIs were estimated for VS1 and VS2 (19). In addition, we compared the probability scores of being an NSCLC case between cases and controls by using t statistics. To test the specificity of the classifier, the whole analysis was repeated 1,000 times by using random feature sets of equal size. For visualization of the test score obtained by the SVM algorithm we used the following transformation: log2 (score + 1) + 0.1.
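The threshold rule can be sketched in Python as follows. Two points are assumptions of this sketch rather than statements from the study: the 95% CI is computed as a Wilson score interval (the method behind ref. 19 is not specified here), and the specificity criterion is interpreted as requiring the upper CI bound to reach 0.5. Function names and toy scores are hypothetical.

```python
from math import sqrt

def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (assumed CI method)."""
    p = successes / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half, center + half

def choose_threshold(case_scores, control_scores, min_specificity=0.5):
    """Return (threshold, sensitivity, specificity) with maximal sensitivity.

    Scores >= threshold are called cases. A threshold is admissible only if
    the upper bound of the specificity interval reaches `min_specificity`.
    """
    best = None
    for t in sorted(set(case_scores) | set(control_scores)):
        sens = sum(s >= t for s in case_scores) / len(case_scores)
        true_neg = sum(s < t for s in control_scores)
        _, upper = wilson_interval(true_neg, len(control_scores))
        if upper < min_specificity:
            continue
        spec = true_neg / len(control_scores)
        # Prefer higher sensitivity; break ties by higher specificity.
        if best is None or (sens, spec) > best[1:]:
            best = (t, sens, spec)
    return best

# Hypothetical training-set scores (not the study's data).
toy_cases = [0.9, 0.8, 0.7, 0.3]
toy_controls = [0.1, 0.2, 0.4, 0.6]
threshold, sensitivity, specificity = choose_threshold(toy_cases, toy_controls)
```

The threshold found this way would then be applied unchanged to the validation sets, in line with the TS-only determination described above.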
To investigate gene ontology (GO) of transcripts used for the classifier we carried out GeneTrail analysis for over- and underexpressed genes (20). To this end, we analyzed the enrichment in genes in the classifier compared with all genes present on the whole array. We analyzed under- and overexpressed genes by using the hypergeometric test with a minimum of 2 genes per category.
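The hypergeometric enrichment test can be written out in a few lines of Python. This is a generic sketch of the test and not GeneTrail's implementation; the function and parameter names are hypothetical, and correction for multiple testing is omitted.

```python
from math import comb

def hypergeom_enrichment_p(n_array, n_category, n_list, n_overlap):
    """P(X >= n_overlap) for a hypergeometric draw.

    n_array:    genes on the whole array (background)
    n_category: background genes annotated to the GO category
    n_list:     genes in the classifier list
    n_overlap:  classifier genes annotated to the category
    """
    total = comb(n_array, n_list)
    upper = min(n_category, n_list)
    tail = sum(
        comb(n_category, k) * comb(n_array - n_category, n_list - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# Toy numbers: 10 background genes, 4 in the category, list of 3, overlap of 2.
p_toy = hypergeom_enrichment_p(10, 4, 3, 2)
```

A category would be tested only if it contains at least 2 genes from the list, following the minimum stated above.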
In addition, we carried out data mining by gene set enrichment analysis (GSEA; ref. 21). As indicated, we compared the respective list of genes obtained in our expression profiling experiment with datasets deposited in the Molecular Signatures Database (MSigDB). The power of the gene set analysis is derived from its focus on groups of genes that share common biological functions. In GSEA an overlap between predefined lists of genes and the newly identified genes can be identified by using a running sum statistics that leads to attribution of a score. The significance of this score is tested by using a permutation design which is adapted for multiple testing (21). Groups of genes, called gene sets were deposited in the MSigDB and ordered in different biological dimensions such as cancer modules, canonical pathways, miRNA targets, and GO terms (http://www.broadinstitute.org/gsea/msigdb/index.jsp). In our analysis we focused on cancer modules. The cancer modules integrated into the MSigDB are derived from a compendium of 1,975 different published microarrays spanning several different tumor entities (22).
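The running sum statistic can be illustrated with a simplified Python sketch. It implements only the unweighted, Kolmogorov-Smirnov-style score and assumes a ranked list without duplicate genes; the weighting options and the permutation-based significance testing of the GSEA software are not reproduced, and the gene names are hypothetical.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score.

    Walks down the ranked list, stepping up at genes in the set and down
    otherwise, and returns the largest deviation from zero.
    """
    members = set(gene_set) & set(ranked_genes)
    n_hits = len(members)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must cover part, but not all, of the list")
    running, best = 0.0, 0.0
    for gene in ranked_genes:
        running += 1.0 / n_hits if gene in members else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best

ranked = ["g1", "g2", "g3", "g4", "g5", "g6"]
es_top = enrichment_score(ranked, {"g1", "g2"})      # set at the top of the list
es_bottom = enrichment_score(ranked, {"g5", "g6"})   # set at the bottom
```

A gene set concentrated at the top of the ranking yields a score near 1, one concentrated at the bottom a score near -1, and a scattered set a score near 0.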
Establishment of a gene expression profiling-based classifier for blood-based diagnosis of NSCLC
The classifier was built on the basis of an initial TS containing 35 NSCLC cases of different stages (stage I, n = 5; stage II, n = 5; stage III, n = 17; stage IV, n = 8) and 42 hospital-based controls suffering in part from severe comorbidities such as COPD, hypertension, cardiac diseases, and malignancies other than lung cancer. We first evaluated 3 different approaches, namely SVM, LDA, and PAM, to identify the best algorithm to build a classifier for the diagnosis of NSCLC in a 10-fold cross-validation design. To this end we used 36 different feature lists extracted from the list of differentially expressed genes according to 36 different cutoff P values of the t statistics. In this setting, the SVM algorithm performed best by reaching the highest AUC (mean AUC = 0.754) at a cutoff P value of the t statistics of 0.003 (Fig. 2A). Thus, for subsequent classification we applied SVM by using the 484-feature list obtained at a cutoff P value of the t statistics of P ≤ 0.003 for differentially expressed genes between cases and controls based on the entire TS. Fold changes of genes with the most significant P values are shown in Figure 2B and all transcripts used in the classifier are summarized in Supplementary Table S2. We next maximized the sensitivity of the classifier while requiring the 95% CI of the specificity to still contain 0.5. Using these criteria, the threshold of the test score was determined to be 0.082. At this threshold, the sensitivity was 0.91 (0.75–0.97) and the specificity 0.38 (0.23–0.54), that is, the 95% CI of the specificity contained 0.5.
The diagnostic NSCLC classifier can be used to detect NSCLC cases in an independent validation set of NSCLC cases and hospital-based controls
First, we tested whether the classifier can be used to discriminate NSCLC cases of early and advanced stages from hospital-based controls. To this end, cases and controls in the first independent validation set were chosen in a similar setting as in the TS, that is, patients with NSCLC stages I to IV and clinical symptoms associated with lung cancer (n = 28) and hospital-based controls with relevant comorbidities (n = 26). The AUC for the diagnostic test of NSCLC in this first validation set was calculated to be 0.824 (P < 0.001; Fig. 3A). In addition, probability scores were significantly different between cases and controls (P < 0.001, t test). Using the threshold determined in the TS, we observed a sensitivity of 0.61 (95% CI, 0.41–0.78) and a specificity of 0.85 (95% CI, 0.64–0.95) in VS1. Regarding only patients with stage III/IV NSCLC (n = 20) in VS1, the sensitivity was 0.70 (95% CI, 0.46–0.87) and the specificity 0.85 (95% CI, 0.64–0.95; data not shown). We observed that 3 out of 3 stage I NSCLC cases had a low score in this cohort of patients with a high degree of comorbidity (Fig. 3E). Thus, patients with NSCLC of advanced stages in VS1 were identified among hospital-based controls by using the threshold determined in the TS.
The diagnostic NSCLC classifier identifies stage I NSCLC patients in an independent second validation set comprising stage I NSCLC cases and healthy blood donors
After showing that the classifier can be used to detect NSCLC cases among individuals with comorbidities, we also investigated whether this test can be used to distinguish NSCLC cases presenting at stage I with no or only minor symptoms from healthy individuals. To this end, we recruited a second independent validation set consisting of 32 NSCLC cases at stage I and 70 healthy blood donors (VS2). By applying the identical classifier to VS2 the AUC was determined to be 0.977 (P < 0.001; Fig. 3C). Again, the classifier was used as a diagnostic test by applying the TS-based threshold of the test score. At this threshold the sensitivity was 0.97 (0.82–0.99) and the specificity 0.89 (0.78–0.95). We also observed a highly significant difference between cases and controls in the probability values of being an NSCLC patient (P < 0.001, t test). Healthy controls without significant comorbidity (VS2) tended to have lower probability scores than hospital-based controls (VS1 and TS), although this finding was not statistically significant (Fig. 3E). Of note, the difference in the probability score between healthy controls and patients with stage I lung cancer was more pronounced than the difference between patients with NSCLC stage III/IV and controls with a similarly high load of comorbidity.
Permutation test to analyze the specificity of the classifier
To further underline the specificity of this classifier, we used 1,000 random feature lists, each comprising 484 features, to likewise build SVM-based classifiers in the TS, which were then applied to VS1 and VS2, respectively. For VS1, the mean AUC obtained by using these random feature lists was 0.49 (range, 0.1346–0.8633) with only 2 AUCs being 0.824 or more, the AUC obtained by using the NSCLC-specific classifier (Fig. 3B). This corresponds to a P value of approximately 0.002 (2 of 1,000) for the permutation test, further confirming the specificity of the NSCLC classifier. Similarly, when the permuted classifiers were applied to VS2, only 1.8% of random feature lists led to an AUC ≥ 0.977, the AUC obtained by using the NSCLC-specific classifier (Fig. 3D). Furthermore, by merging TS and VS1 and randomly generating new dataset splits into TS′ and VS1′, it could be shown that highly specific classifiers can be built independently of the initial composition of the TS (data not shown). In conclusion, an NSCLC-specific blood-based classifier was built that was successfully used to identify NSCLC cases among hospital-based controls as well as NSCLC cases of early stage among healthy individuals.
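The P value of the permutation test follows directly from the share of random feature lists that match or exceed the observed AUC. A minimal Python sketch, with a hypothetical function name and made-up random AUCs that only reproduce the count of 2 exceedances in 1,000:

```python
def empirical_p_value(observed_auc, random_aucs):
    """Fraction of random feature lists whose AUC reaches the observed AUC."""
    hits = sum(a >= observed_auc for a in random_aucs)
    return hits / len(random_aucs)

# Made-up AUCs: 998 lists at chance level and 2 lists above the observed AUC.
toy_random_aucs = [0.5] * 998 + [0.85, 0.86]
p_vs1 = empirical_p_value(0.824, toy_random_aucs)
```

With 2 of 1,000 random lists at or above 0.824 the empirical P value is 0.002; the 1.8% reported for VS2 corresponds to 0.018 in the same way.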
Mining of expression profiles
Different strategies were used to analyze the biological significance of the 484 features extracted as classifier by the SVM approach. First, we used GeneTrail (20) to analyze the enrichment of GO terms among the genes associated with NSCLC in our study. We observed 112 GO categories showing a significant (false discovery rate–corrected P < 0.05) enrichment of genes in our extracted gene list, of which 25 were associated with the immune system (Supplementary Table S3). These data indicate a contribution of immune cells to the genes involved in the classifier.
Next, we carried out a GSEA (21, 22) thereby focusing on cancer modules which comprise groups of genes participating in biological processes related to cancer. Initially, the power of such modules has been shown exemplarily for single genes such as cyclin D1 or PGC-1α (23, 24) and a more comprehensive view on such modules has been introduced recently (22). This comprehensive collection of modules allows the identification of similarities across different tumor entities such as the common ability of a tumor to metastasize to the bone, for example, in subsets of breast, lung, and prostate cancers (22). Overall, 456 such modules are described in the database spanning several biological processes such as metabolism, transcription, cell cycle, and others.
When analyzing the identified 484 NSCLC-specific features, 199 cancer modules including 26% of all NSCLC-associated modules were identified to show a significant enrichment. This indicates that genes used to build a classifier for NSCLC cases in our study represent, in part, a subset of biologically cooperating genes that are also differentially expressed in primary lung cancer.
To further investigate the specificity of the extracted list of 484 features obtained from our analysis for the classification of NSCLC, we also calculated the overlap between this extracted gene set and a set of genes differentially expressed in the blood of patients with renal cell cancer (7). No significant overlap was observed between the two gene sets. Similarly, no overlap was observed between our NSCLC-specific gene set and gene sets obtained from blood-based expression profiles specific for melanoma (10), breast cancer (8), and bladder cancer (9). In summary, these data point to an NSCLC-specific gene set present in our classifier.
Using RNA-stabilized whole blood from smokers in 3 independent sets of NSCLC patients and controls, we present a gene expression–based classifier that can be used as a biomarker to discriminate between NSCLC cases and controls. The optimal parameters of this classifier were first determined by applying a classical 10-fold cross-validation approach to a TS consisting of NSCLC patients (stages I–IV) and hospital-based controls. Subsequently, this optimized classifier was successfully applied to 2 independent validation sets, namely VS1 comprising NSCLC patients of stages I to IV and hospital-based controls and VS2 containing patients with stage I NSCLC and healthy blood donors. This successful application of the classifier in both validation sets underlines the validity and robustness of the classifier. Extensive permutation analysis by using random feature lists and the possibility of building specific classifiers independently of the composition of the initial TS further support the specificity of the classifier. We found no association between stage of disease and the probability score assigned to each sample. In addition, we observed no association between other cancers and the probability score of the controls (data not shown). However, controls without documented morbidity (controls in VS2) tended to have lower probability scores of being a case than controls with documented morbidity, although this was not statistically significant.
The gene set used to build the classifier was enriched in genes related to immune functions. We therefore postulate that the classifier is based on the transcriptome of blood-based immune effector cells rather than influenced by the occurrence of rare tumor cells occasionally detected in blood of cancer patients, although this possibility cannot be ruled out (25). Moreover, the lack of NSCLC tumor cell–specific transcripts, for example, thyroid transcription factor (TTF1), cytokeratins, or human telomerase reverse transcriptase, in our classifier points in the same direction. GSEA (9) of the gene set used in our diagnostic approach in comparison with published expression datasets from a variety of cancer entities (22) revealed a significant overlap with 26% of the lung cancer tissue–specific gene expression profiles. As NSCLC tissue consists of tumor cells, immune cells, and stromal cells (10), we presume that the similarity of both gene sets is due to a similar regulation of genes present in immune cells in NSCLC tissue and peripheral blood of NSCLC patients. These findings are in line with the data showing tumor-induced alteration of the immune system in mice (6, 7) and in humans (11, 26).
Recently, Showe and colleagues (11) reported an NSCLC-associated gene expression signature derived from PBMC of predominantly early-stage NSCLC patients. An enrichment of immune-associated pathways in the signature was also observed in that study, further indicating that the alteration of the immune system might be a common feature already during the initial phase of NSCLC development. As we used RNA-stabilized whole blood and not PBMC for analysis, we were not surprised that the signature identified by Showe and colleagues could not be used in our dataset to distinguish between cases and controls. The same holds true when applying our classifier to the published dataset (Zander and Schultze, unpublished data). Findings derived from several of our own studies further underline that signatures derived from PBMC and RNA-stabilized whole blood samples cannot be directly compared (refs. 12, 13; Schultze, unpublished data). However, as previously shown by us and others, for clinical applicability and robustness we would favor RNA-stabilized approaches because these methods yield more reliable results in a multicenter setting (13).
Overall, our data show the feasibility of a diagnostic test for NSCLC based on RNA-stabilized whole blood. Our findings form the basis for validation studies in a multicenter setting in prevalent NSCLC patient cohorts enriched for early-stage disease. In the end, this endeavor might open the avenue to test the blood-based NSCLC classifier in prospective trials to evaluate the predictive potential of diagnostic classifiers for NSCLC in high-risk individuals.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were declared.
J.L. Schultze and J. Wolf were supported by the Helmholtz-Gemeinschaft (VH-VI-143). J.L. Schultze was also supported by the Humboldt-Foundation (Sofja Kovalevskaja award) and a Köln Fortune grant. J. Wolf and R.K. Thomas were supported by the NGFNplus-program of the German Ministry of Science and Education (BMBF; Grant 01GS08100). A. Staratschek-Jox was supported by the Monika Kutzner Stiftung.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
We thank Julia Classen for experimental assistance.
Note: Supplementary data for this article are available at Clinical Cancer Research Online (http://clincancerres.aacrjournals.org/).
Ethics approval: The study was approved by the Ethics Committee of the University of Cologne.
- Received March 2, 2010.
- Revision received January 12, 2011.
- Accepted February 15, 2011.
- ©2011 American Association for Cancer Research.