
Molecular Oncology, Markers, Clinical Correlates
Vanderbilt-Ingram Cancer Center and Department of Medicine [N. Y., K. Y., T. P. D., S. N., D. P. C.], Department of Preventive Medicine [P. L., Y. S.], Department of Pathology [M. E., A. G., R. J.], Department of Cardiac and Thoracic Surgery [J. R. R.], and Department of Molecular Physiology and Biophysics [S. L., J. H. M.], Vanderbilt University School of Medicine, Nashville, Tennessee 37232-6838; Cardiovascular Surgical Associates, Saint Thomas Hospital, Nashville, Tennessee 37205 [J. C. N.]; and Hamon Center for Therapeutic Oncology Research, University of Texas Southwestern Medical Center, Dallas, Texas 75235 [J. D. M.]
ABSTRACT
Purpose: RNA expression patterns associated with non-small cell lung cancer subclassification have been reported, but there are substantial differences in the key genes and clinical features of these subsets, casting doubt on their biological significance.
Experimental Design: In this study, we used a training-testing approach to test the reliability of cDNA microarray-based classifications of resected human non-small cell lung cancers (NSCLCs).
Results: Groups of genes were identified that were able to differentiate primary tumors from normal lung and lung metastases, as well as identify known histological subgroups of NSCLCs. Groups of genes were identified to discriminate sample clusters. A blinded confirmatory set of tumors was correctly classified by using these patterns. Some histologically diagnosed large cell tumors were clearly classified by expression profile analysis as being either adenocarcinoma or squamous cell carcinoma, indicating that this group of tumors may not be genetically homogeneous. High α-actinin-4 expression was identified as highly correlated with poor prognosis.
Conclusions: These results demonstrate that gene expression profiling can identify molecular classes of resected NSCLCs that correctly classify a blinded test cohort and that correlate with and supplement standard histological evaluation.
INTRODUCTION
Lung cancer represents a challenging clinical problem in most developed countries. In the United States, lung cancer causes more deaths than the next four most common cancers combined. Despite the best current treatment, the overall 5-year survival after diagnosis is only 10-15%. Improvements in prevention, early detection, prognosis, and therapy have been difficult to achieve. Lung cancers display a broad range of clinical behaviors: they range from slowly progressing to rapidly fatal, they can be highly metastatic or only locally invasive, and they may be responsive or resistant to therapy (1); the molecular basis of these variations in behavior is completely unknown.
The classification of lung cancers has traditionally been based primarily on light microscopic morphological findings. According to the current histological lung cancer classification proposed by the WHO in 1981, lung cancers can be divided into two broad groups: small cell lung cancer, accounting for 20-25% of bronchogenic carcinomas, and NSCLC,2 accounting for almost all of the remaining cases. NSCLC has three major subgroups: adenocarcinoma, squamous cell carcinoma, and large cell carcinoma (2). Even within the subgroup of NSCLC there is a great degree of heterogeneity in behavior, and despite decades of research the histological subclassifications for NSCLCs have no predictive use, so all are treated identically.
It is clear that each tumor has unique genetic differences, and it is hypothesized that these differences determine its biological behavior. A large effort has been made by many laboratories to study many individual candidate genetic abnormalities in an attempt to develop molecular markers for lung cancer classification and prognosis, but after hundreds of such studies, none of these single markers are of any real clinical utility. Even today, all NSCLCs are usually treated identically, stage for stage, and no molecular marker is used for routine therapeutic decisions. Thus, it is becoming clear that complex biological behaviors of tumors will only be explainable by complex patterns of multiple markers.
Microarray technology has enabled expression analysis of thousands of genes at one time, allowing insight into complex gene expression patterns and perturbations (3). To date, microarray technology has been successfully applied to a wide variety of malignant diseases, such as leukemia, lymphoma, colon cancer, melanoma, ovarian cancer, breast cancer, hepatocellular carcinoma, and prostate cancer (4, 5, 6, 7, 8, 9, 10). These studies have succeeded in identifying dozens of crucial genes that are up- or down-regulated in certain types of malignant cells or tissues. In lung cancer, several groups have reported microarray-based subclassifications of lung adenocarcinomas, but these studies differ from each other in significant ways, and none of these studies have tested their patterns by using blinded sets of tumors (11, 12, 13).
To begin to explore lung cancer molecular profiles that predict biological behavior, e.g., histological subtype or survival of lung cancer patients, we applied cDNA microarray technology to the study of a set of freshly resected human lung cancers and used multiple statistical methods to correlate differentially expressed genes with histology and clinical outcome. We then successfully tested this classification pattern with an independent cohort of tumors.
MATERIALS AND METHODS
Tissues and Cell Lines.
Lung cancer, normal lung, and nonlung tumor tissues in excess of what was necessary for diagnostic purposes were obtained <15 min after removal from the patient and placed in RNAlater (Ambion, Austin, TX) before being snap frozen in liquid nitrogen. Before and independent of the molecular analysis, the lung cancer patients were assigned a Tumor-Node-Metastasis postsurgical stage score according to the current international lung cancer staging system, and then classified histologically into adenocarcinoma, squamous cell carcinoma, large cell carcinoma, and small cell carcinoma using standard WHO criteria (2). The cancers were additionally classified as well, moderately, or poorly differentiated. A pool of RNAs isolated from six different lung cancer cell lines was prepared as a common reference to represent the common histological subtypes of lung tumor. This RNA pool provided us with a large amount of renewable and consistent RNA likely to contain the vast majority of genes expressed in lung cancer and enabled us to compare gene expression patterns across tumor samples. Six lung cancer cell lines of different histological subtypes, H23, H69, H157, H441, H596, and H727, were selected and purchased from American Type Culture Collection. All cell lines tested negative for Mycoplasma contamination. The cell lines were cultured in RPMI 1640 supplemented with 10% fetal bovine serum (Hyclone, Logan, UT) under exponential growth conditions at 37°C in 5% CO2 until 70-80% confluent in 150-cm² flasks. Refer to the supplementary information on our website for more information.3
RNA Preparation.
Cell lines were washed once with PBS and immediately processed for RNA extraction. Cell lines and pulverized snap-frozen tumor tissues were lysed in TRIzol (Life Technologies, Inc., Rockville, MD), and total RNAs were extracted. The RNA was purified by RNeasy (Qiagen, Valencia, CA) according to the manufacturer's instructions and stored at -80°C. The quantity and the quality of RNA were assessed by UV spectrometer and electrophoresis on a 1% formaldehyde agarose gel.
Microarray Preparation.
A total of 5184 cDNA inserts, representing 4827 unique genes, were PCR amplified from sequence-verified human clones purchased from Research Genetics (Rockville, MD). Aliquots of all of the PCR products were examined by agarose gel electrophoresis. The 681 products that did not amplify or contained multiple bands were not used for data analysis. For arraying, the PCR-amplified cDNA inserts were resuspended in 3x SSC and arrayed onto poly-L-lysine-coated glass slides by a Stanford-type microarrayer robot at the Vanderbilt Microarray Shared Resource. Printed slides, VMSR human 5k cDNA microarrays, were postprocessed by UV exposure. The complete gene list and chip production information are available on our web site.4
Just before hybridization, slides were prehybridized in 5x SSC, 0.1% SDS, and 1% BSA (Sigma, St. Louis, MO) for 45 min at 42°C. After washing five times by dipping in MilliQ purified water at room temperature, slides were dipped into isopropanol and air-dried.
Microarray Analysis.
Fluorescently labeled cDNAs were made from 50 to 100 µg of sample and reference total RNA through an anchored oligodeoxythymidylic acid [5'-T(20)VN-3' (V = any nucleotide except T, N = any nucleotide)] primed reverse transcriptase reaction. The labeling reactions were done in the presence of 100 ng/µl anchored primer, 200 µM each of dATP, dGTP, and dTTP; 120 µM dCTP; 10 mM DTT; 200 units of SuperScript II (Life Technologies, Inc.); and 120 µM Cy3-dCTP or Cy5-dCTP (Amersham Pharmacia, Piscataway, NJ) in a 30-µl solution. After hydrolyzing the RNA template with NaOH, the labeled single-stranded cDNA was purified by QIAquick PCR purification kit (Qiagen) according to the manufacturer's instructions. The purified cDNA was dried in a SpeedVac and resuspended in hybridization solution [3x SSC/0.2% SDS/1 µg/µl yeast tRNA/1 µg/µl poly(dA)], heat denatured, applied to the slide, and sealed under a coverslip. The slide was placed in a humidified chamber at 65°C for 14-16 h. After hybridization, the slide was washed in 2x SSC/0.1% at 55°C for 5 min, 1x SSC at room temperature for 5 min, and 0.1x SSC at room temperature for 5 min. The slide was dried and scanned by a GenePix 4000B scanner (Axon Instruments, Inc., Foster City, CA). The resulting image was analyzed using GenePix Pro software (Axon Instruments, Inc.). To assay for artifacts caused by specific dye combinations, we performed two reciprocal hybridizations on different arrays for each sample (switching the dyes between test sample and reference cDNAs). When extracting the data from the original images captured by the scanner, spots with obvious blemishes or spots with diameters <60 µm were flagged at the initial step and excluded from further analysis. Also, 96 spots that had not been assigned to any unigene cluster were excluded from this analysis. Normalization was performed based on the premise that the arithmetic mean of the ratios from every spot that performed well should be equal to 1 (14). In brief, nonflagged spots that had a ratio between 0.1 and 10 were selected. The log value for the expression ratio of each selected spot was determined. The average of all log values (AvgLog) was calculated. The normalized ratio for each spot was calculated as $\text{normalized ratio} = \text{original ratio} / 10^{\mathrm{AvgLog}}$. The data from spots for which the fluorescence intensity in each channel was <1.4 times the local background were excluded from subsequent analysis. We also excluded data that were available from only one of the reciprocal experiments or for which data from the reciprocal experiments differed by >2-fold. After this data filtering, we averaged the data from the reciprocal experiments on the logarithmic scale. Finally, data from spots that had >70% of data present across the samples in the training cohort were used for additional statistical analysis. Thus, in the final data table, we had data from 3811 spots that represented 3647 genes. We used every gene in this table to search for differentially expressed genes among the training sample subtypes. The original data tables and more information regarding our analysis are available as supplementary information on our website.3
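The normalization and reciprocal-experiment filtering described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: the function names `normalize_ratios` and `combine_reciprocal` are hypothetical, the logarithm is taken as base 10 (implied by the division by 10 raised to AvgLog), the 0.1 to 10 window is treated as inclusive, and both dye-swap measurements are assumed to have already been expressed as sample/reference ratios.

```python
import math


def normalize_ratios(ratios, flagged, low=0.1, high=10.0):
    """Divide every ratio by 10**AvgLog, where AvgLog is the mean log10
    ratio of the nonflagged spots whose ratio lies in [low, high]."""
    selected = [r for r, f in zip(ratios, flagged) if not f and low <= r <= high]
    if not selected:
        raise ValueError("no usable spots for normalization")
    avg_log = sum(math.log10(r) for r in selected) / len(selected)
    factor = 10 ** avg_log
    return [r / factor for r in ratios], avg_log


def combine_reciprocal(log_a, log_b, max_fold=2.0):
    """Average two reciprocal (dye-swap) log10 ratios on the log scale.

    Returns None when either value is missing or when the two values
    differ by more than max_fold (assumed orientation: sample/reference)."""
    if log_a is None or log_b is None:
        return None
    if abs(log_a - log_b) > math.log10(max_fold):
        return None
    return (log_a + log_b) / 2.0
```

Because the scaling factor comes from the mean of the log ratios, this procedure sets the geometric mean of the selected ratios to 1.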
Statistical Data Analysis.
The statistical analyses for the microarray data were focused on the following steps: (a) selecting the important genes that were differentially expressed among the histological groups; (b) using the class prediction model based on the WFCCM (15, 16) to verify whether the genes selected in step a have statistically significant predictive power on the training samples; (c) applying the prediction model generated in step b to a set of blinded samples to examine its predictive power on those samples; and (d) using the agglomerative hierarchical clustering algorithm (17) to investigate the pattern among the statistically significant discriminator genes as well as the biological status.
The selection of important genes was based on SAM (18), Weighted Gene Analysis (10), and the t test, and the cutoff points for each method were 3.7, 3.0, and P < 0.0001, respectively. The cutoff points were determined based on the significance as well as the predictive power of each method. A gene was placed on the final list if it met at least one of these three selection criteria.
The WFCCM (15, 16) was used in the class-prediction model based on the selected genes. This method was designed to combine the most significant genes associated with the biological status from each analysis method, e.g., SAM, Weighted Gene Analysis, Info Score (20), and t test. In other words, the WFCCM is an extension of the compound covariate method, which allows for the consideration of more than one statistical analysis method into the compound covariate, and it reduces the dimensionality of the problem using a new covariate obtained as a weighted sum of the most important predictors. The WFCCM for tumor sample i is defined as $\mathrm{WFCCM}(i) = \sum_j \left[\sum_k ST_{jk}\right] W_j \, x_{ij}$, where $x_{ij}$ is the log-ratio measured in tissue sample i for gene j. $ST_{jk}$ is the standardized statistic, e.g., t-statistic, for statistical analysis method k. $W_j$ is the weight of gene j, which is defined as $W_j = \left(\sum_k I_{jk} / K\right)\left(1 - \mathrm{InfoScore}_j\right)$, where $I_{jk} = 1$ if gene j was statistically significant in method k, and $I_{jk} = 0$ if gene j was not statistically significant in method k.
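As a rough illustration of the compound covariate defined above, the following sketch computes the gene weight W_j (the fraction of methods in which gene j is significant, multiplied by one minus its Info Score) and the score for one sample (the sum over genes of the summed standardized statistics times W_j times the log-ratio x_ij). The function names are hypothetical, K is assumed to be the number of analysis methods, and skipping missing log-ratios is an assumption made here rather than something stated in the text.

```python
def gene_weight(significant_flags, info_score):
    """W_j = (sum_k I_jk / K) * (1 - InfoScore_j).

    significant_flags holds one 0/1 indicator per analysis method, so its
    length is taken to be K (an assumption about what K denotes)."""
    k = len(significant_flags)
    return (sum(significant_flags) / k) * (1.0 - info_score)


def wfccm_score(log_ratios, std_stats, weights):
    """WFCCM(i) = sum_j [sum_k ST_jk] * W_j * x_ij for a single sample i.

    log_ratios[j] is x_ij, std_stats[j] lists ST_jk over the methods k,
    and weights[j] is W_j."""
    total = 0.0
    for x_ij, st_j, w_j in zip(log_ratios, std_stats, weights):
        if x_ij is None:
            # Missing spot: skipped in this sketch (an assumption).
            continue
        total += sum(st_j) * w_j * x_ij
    return total


# Two hypothetical genes scored by two methods.
weights = [gene_weight([1, 1], 0.0), gene_weight([1, 0], 0.5)]
score = wfccm_score([1.0, -2.0], [[4.0, 3.5], [-5.0, -2.0]], weights)
```

Genes that are up-regulated and genes that are down-regulated in a class both push the score in the same direction, because the sign of the standardized statistic multiplies the sign of the log-ratio.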
The class-prediction model was applied to determine whether the patterns of gene expression could be used to classify tissue samples into two classes according to the chosen parameter, e.g., normal tissue versus tumor tissue. We estimated the misclassification rate using a leave-one-out cross-validated class prediction method based on the WFCCM. This leave-one-out cross-validation was carried out in four steps. First, the WFCCM was applied to calculate the single compound covariate for each tissue sample based on the significant genes. Second, one tissue sample was selected and removed from the data set, and the distance between the two tissue classes was calculated for the remaining tissue samples. Third, the removed tissue sample was assigned to whichever of the two tissue classes it lay closer to. Fourth, steps 2 and 3 were repeated for each tissue sample. To determine whether the accuracy for predicting membership of tissue samples into the given classes (as measured by the number of correct classifications) was better than the accuracy that could be attained for predicting membership into random groupings of the tissue samples, we created 5000 random data sets by permuting class labels among the tissue samples. The cross-validated class prediction was performed on the resulting data sets, and the percentage of permutations that resulted in as few or fewer misclassifications as for the original labeling of samples was reported. If <5% of the permutations (a proportion <0.05) resulted in as few or fewer misclassifications, the accuracy of prediction into the given classes was considered significant.
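The cross-validation and permutation steps can be sketched as follows. This is a simplified illustration, not the published procedure: it assumes that "closeness" means distance to the mean compound covariate of each class among the remaining samples, it takes the per-sample scores as given, and it keeps those scores fixed while permuting labels, whereas a fuller implementation would presumably recompute gene selection and scores for each permuted labeling. The function names are hypothetical.

```python
import random


def loo_misclassifications(scores, labels):
    """Leave-one-out count of samples assigned to the wrong class (0 or 1)
    by a nearest-class-mean rule on a one-dimensional score."""
    errors = 0
    for i, held_out in enumerate(scores):
        means = {}
        for cls in (0, 1):
            vals = [s for j, (s, l) in enumerate(zip(scores, labels))
                    if j != i and l == cls]
            if vals:
                means[cls] = sum(vals) / len(vals)
        if len(means) < 2:
            # One class is empty without this sample: count as an error.
            errors += 1
            continue
        predicted = min(means, key=lambda c: abs(held_out - means[c]))
        if predicted != labels[i]:
            errors += 1
    return errors


def permutation_p_value(scores, labels, n_perm=5000, seed=0):
    """Fraction of label permutations with as few or fewer leave-one-out
    misclassifications as the original labeling."""
    rng = random.Random(seed)
    observed = loo_misclassifications(scores, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if loo_misclassifications(scores, shuffled) <= observed:
            hits += 1
    return observed, hits / n_perm


# Hypothetical compound covariate scores for 4 + 4 samples.
scores = [-3.0, -2.5, -2.0, -1.5, 2.0, 2.5, 3.0, 3.5]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
observed, p_value = permutation_p_value(scores, labels)
```

With only eight samples there are few distinct labelings, so the smallest attainable permutation proportion is bounded well above zero; this is one reason the significance of such a test depends on cohort size.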
The blinded samples were predicted using the method described above: each blinded sample was assigned to the closer of the two tissue classes, with distances determined using the WFCCM.
The agglomerative hierarchical clustering algorithm (17) was applied to investigate the pattern among the statistically significant discriminator genes as well as the biological status using the software of Eisen et al. (21).
Survival was estimated with the Kaplan-Meier method, and differences between groups were compared with the log-rank test.
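For readers unfamiliar with the Kaplan-Meier estimator, a minimal sketch follows. It is illustrative only, with a hypothetical function name and toy data, and it does not reproduce the software used in the study; the log-rank comparison between groups is not shown.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times: follow-up time per patient; events: 1 = death, 0 = censored.
    Returns a list of (time, survival probability) at each death time."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve


# Toy follow-up data in months (not taken from the study).
curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
```

Censored patients contribute to the number at risk up to their last follow-up time but never lower the survival estimate themselves.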
Sequence Verification of cDNAs.
For the differentially expressed genes, the cDNA inserts were verified by DNA sequencing using vector-specific modified M13 primers, forward primer (5'-GTTTTCCCAGTCACGACGTTG-3') or reverse primer (5'-TGAGCGGATAACAATTTCACACAG-3'). Cycle sequencing reactions were performed with fluorescent-labeled nucleotides at the DNA sequencing shared resource in Vanderbilt-Ingram Cancer Center. Sequence database searches were performed with Basic Local Alignment Search Tool (BLAST) sequence comparison programs at National Center for Biotechnology Information.5
RESULTS
Initial Data Analysis and Sample Re-Evaluation.
We first sought to establish whether molecular profiling of our tumor set could identify genes whose expression correlated with known light microscopic histological subgroups of lung cancer. Differentiating NSCLC from small cell lung cancer, or a lung primary from a metastasis to the lung from another organ, is of significant clinical interest and sometimes problematic in practice. The ability of this technique to identify patterns of genes associated with these known histological subgroups would also serve as a useful proof of principle. For this purpose, we analyzed 26 resected primary lung cancers, 3 normal lung tissues, and 2 metastatic lung tumors as our training cohort (Table 1, Supplementary Table 1).3
We compared the gene expression profiles of the following sample group pairs: normal lung tissues to tumor tissues, normal lung to primary lung tumors, and normal lung to metastatic lung tumors and NSCLCs. Furthermore, within the NSCLC group, we compared adenocarcinomas to nonadenocarcinomas, squamous cell carcinomas to nonsquamous cell carcinomas, and large cell carcinomas to non-large cell carcinomas. According to our statistical criteria (P ≤ 0.0001 or absolute value of SAM ≥ 3.75), we were able to identify groups of genes whose expression level best segregated samples into these groups, except for the large cell carcinoma category (Supplementary Fig. 1, A-D, right gene lists).3
Slides of the samples (samples 32, 5, 22, 4, 19, and 23) that had discrepancies between histological classification and gene expression classification were re-examined by a pathologist. Several light microscopic misclassifications were identified. According to histological re-evaluation, sample 32 contained only about 15% tumor tissue, and the other 85% consisted of fibrous tissue, explaining its clustering with the normal samples. Sample 5 showed nests with central necrosis suggesting cells with squamous differentiation. However, most of the tumor in this sample was very poorly differentiated lung cancer without intercellular bridges and without cytoplasmic keratinization, such that this sample might be better classified as large cell lung cancer. We also found that sample 22 was better classified as large cell carcinoma on review.
Through this initial analysis, we recognized the need for careful reverification of standard clinical histology and information on our samples, and all of the samples were rereviewed without finding any more major histological changes. In our review of clinical information, we found that sample 34 and sample 43 had received chemotherapy before surgery. For additional analysis, these two squamous cell carcinoma samples were analyzed in the test cohort separately from the other, nonpretreated samples (Supplementary Table 1).3
Identification of Genes Expressed in Sample Groups.
We reanalyzed the entire data set using the revised sample information (Supplementary Table 1).3 Again, we sought genes whose expression levels best segregated samples into histological groups using the same statistical methods. Through this analysis, in most of the histological group comparisons, we identified a larger number of significant genes (Fig. 2, A-E; Supplementary Table 6).3 When we compared large cell lung carcinoma to non-large cell carcinoma in NSCLCs, we identified 2 statistically significant genes. In addition to the comparisons used in the initial analysis, we compared 3 more sample combinations within NSCLCs: adenocarcinomas to squamous cell carcinomas, adenocarcinomas to large cell carcinomas, and squamous cell carcinomas to large cell carcinomas. Although we could identify 27 genes whose expression levels differentiated adenocarcinoma from squamous cell carcinoma (Fig. 2F), we could not identify any gene whose expression level differentiated large cell carcinomas from adenocarcinoma or large cell carcinomas from squamous cell carcinoma. The expression differences of several of these genes were also confirmed by Northern blotting (Supplementary Fig. 2, A and B).3
Hierarchical cluster analysis showed clearer clustering for each histological group compared with the initial analysis (Fig. 2, A-D). However, large cell carcinomas were again difficult to cluster into one group. When we clustered NSCLC samples according to the expression levels of genes whose expression patterns correlated with adenocarcinoma, large cell carcinoma samples 4 and 5 clustered closely with adenocarcinomas (Fig. 2C). When we clustered NSCLC samples according to the expression levels of genes whose expression patterns correlated with squamous cell carcinoma, large cell carcinoma samples 6, 14, and 19 clustered with squamous cell carcinomas (Fig. 2D). Even when we used the expression patterns of the 2 genes identified as correlating significantly with large cell carcinoma, the 2 large cell carcinoma samples 5 and 19 clustered with non-large cell carcinomas (Fig. 2E). Overall, our class prediction model system classified all of the samples in the training cohort correctly except sample 5 (a large cell carcinoma), sample 23 (a squamous cell carcinoma), and sample 19 (a large cell carcinoma; Table 2).
Identification of Genes That Relate to Clinical Behavior.
Finally, we identified genes that correlated with the biological behavior of lung cancers. For this analysis, we used two kinds of clinical information, postoperative nodal metastatic status and overall survival (Supplementary Table 7).3 In the first analysis, we looked for genes that were differentially expressed between the nodal metastasis negative (N0, n = 17) and positive groups (N1 or N2, n = 14) in NSCLC samples (n = 31; data from sample 47 were removed from this analysis because it duplicated sample 6; Supplementary Table 8).3 No gene was identified as differentially expressed between these groups in this dataset. Next, we tested for genes whose expression level correlated with overall survival. When we compared the group with overall survival ≥1 year (n = 19) to the group who died within 1 year (n = 4; Supplementary Table 8),3 we identified one gene, ACTN4 (H50993), as highly expressed in the poor survival group. We then compared the survival of the ACTN4 low-expression group to that of the ACTN4 high-expression group (Supplementary Table 9).3 As is shown in Fig. 3, the expression level of the ACTN4 gene was a significant prognostic predictor of overall survival.
DISCUSSION
We applied a gene expression profiling approach to lung cancer classification. In this study, we used supervised methods to identify genes that correlated with certain biological features of a training cohort of tumors. We then examined the relationship between the identified gene expression patterns and sample histologies by two statistical methods, hierarchical clustering analysis and a class prediction model system. A blinded confirmatory set of samples in the test cohort was used to confirm our findings.
Three recent studies have analyzed lung cancer gene expression profiles by microarray technology (11, 12, 13). In two of these studies (11, 12), the investigators selected for analysis those genes whose expression was most similar within duplicate experiments yet varied widely among the other tumor samples. Then, using gene expression profiles of the selected genes across their tumor set, they clustered the samples by an unsupervised method to identify potentially novel classes in lung cancers. Each group discovered different candidate lung cancer subtypes within adenocarcinoma. Although the statistical approaches we used were different, several genes we identified as discriminators of histological groups were contained within the sets of discriminant genes reported in those two papers. For example, four and a half LIM domain 1 was highly expressed in normal lung, and keratin 5 and bullous pemphigoid antigen 1 were highly expressed in squamous cell carcinoma in our study, as was reported by Bhattacharjee et al. (12). Folate receptor, KIAA 1319 protein, and mucin 1 were highly expressed in adenocarcinoma compared with squamous cell carcinoma in our study, as was also reported by Garber et al. (11). Besides these previously identified genes, we identified many genes that had not been reported previously to be differentially expressed in lung cancers. The expression level of these genes or the proteins encoded by these genes may be useful as novel biomarkers.
In our analyses, the large cell carcinoma group was extremely difficult to cluster into one group by its gene expression profile. When we attempted to identify the genes that correlated with large cell carcinomas, no genes or only a few genes were identified as candidates. Also, when we examined the relationships between gene expression patterns and sample histologies by using hierarchical clustering analysis and a class prediction model, large cell carcinomas were outliers. For example, large cell carcinoma samples 5 and 19 had gene expression profiles quite similar to adenocarcinoma and squamous cell carcinoma, respectively. These data suggest that tumors that appear poorly differentiated by light microscopic evaluation may be more related to tumors from either of these two groups than to each other. In their clustering analysis across 73 lung samples, Garber et al. (11) had four pure large cell carcinomas. Of those four large cell carcinomas, three large cell lung carcinomas clustered with adenocarcinomas to make a large cell carcinoma cluster, and the remaining large cell carcinoma clustered with adenocarcinomas in one of the adenocarcinoma subgroups, the adeno group 3 cluster. The number of large cell lung cancers analyzed by microarray technology is still too small to draw clear conclusions as to whether large cell carcinoma is a genetically distinct group, although our data suggest that it is not. Greater numbers of large cell carcinomas with complete clinical information need to be analyzed to answer this question.
Our study is unique in that we used a blinded test cohort to confirm our prediction model system. In this test cohort, we had 3 kinds of metastatic tumors that were not in our training cohort: a colon adenocarcinoma (sample 48), an adrenal tumor (sample 75), and a hepatocellular carcinoma (sample 76). Our prediction model accurately predicted 2 of these, the adrenal tumor and the hepatocellular carcinoma, as nonprimary lung tumors. However, our prediction model predicted the colon adenocarcinoma as a primary lung adenocarcinoma. Interestingly, this tumor arose in the lung of a patient cured of lymphoma a decade earlier, and was initially diagnosed as a lung cancer. After its resection, however, the patient developed widespread metastatic disease, including a lesion in the colon, and the tumor was clinically felt to represent a metastatic colon cancer, although this is not clinically indisputable.
We also had 2 tumors in our test cohort from patients who had received preoperative therapy. Both patients had radiation and chemotherapy before surgery; sample 34 did not respond significantly to this therapy, whereas sample 43 responded well to treatment. Our prediction model classified sample 34 as primary lung carcinoma, but sample 43 was classified as a non-lung primary metastasis. This might reflect the expression changes associated with response to treatment and deserves additional investigation. Much more refinement of the model with a variety of lung and non-lung tumors, and with larger arrays, will be required to obtain a more accurate histological prediction model.
Although histological classification is an interesting proof of principle, the potential clinical benefit of this technology will lie in its ability to predict the biological and clinical behavior of tumors. To identify such patterns, a supervised statistical method is required. In our study, we found that the expression level of a novel marker, ACTN4, was a significant prognostic predictor in these lung cancer patients. The ACTN4 gene product was originally identified through immunoscreening of monoclonal antibodies reactive with proteins up-regulated upon enhanced cell movement (25). Recently, Beer et al. (13) identified sets of genes whose expression profiles were correlated with survival of stage I patients with lung adenocarcinomas. They chose these genes by leave-one-out and training-testing cross-validation methods, and used their model to predict survival. Unfortunately, the gene we identified as most correlated with survival, ACTN4, was not included in their list of 4966 analyzed genes. Therefore, we cannot determine the significance of ACTN4 in their data set. Although our result was statistically significant, our sample size was small and the follow-up interval was short, so the significance of ACTN4 expression in lung cancer survival needs to be reconfirmed in a larger cohort with a longer follow-up interval.
There has been concern about potential errors in cDNA clones used for microarray production (19). To increase our confidence in our findings, we sequenced every cDNA clone identified as important for each of the classes in our study. Eight of 172 clones (4.7%) showed incorrect sequences. This error rate is much lower than those reported in previous studies and suggests that sequence verification is indispensable for verifying microarray data. Unfortunately, the data generated by the Beer et al. (13) and Bhattacharjee et al. (12) studies cannot be confirmed in this way because of limitations in proprietary oligo array technology.
Our results thus confirm that gene expression profiling may be an efficient tool for classifying lung tumors into biologically important and/or prognostic groups, and identifying genes associated with those distinctions. We demonstrate that these classifications can successfully classify blinded test cohorts of tumors. Using a small set of tumors and only limited clinical information, we successfully identified sets of genes distinguishing cancer from normal lung tissue, lung primary from lung metastasis, and the known light microscopic histological groups from one another. Our data also provide evidence that large cell carcinomas may not be a genetically distinct group, but may often represent tumors more closely related to adenocarcinomas or squamous cell carcinomas than to each other. In a few cases in our study, genetic predictions did not agree with the findings of traditional light microscopic evaluation and identified misclassifications in the original pathology reports. An unsuspected association of ACTN4 expression with survival was also identified and should be investigated further in larger numbers of patients. It is hoped that in the future, genetic classification will indicate other novel features of these tumors that were previously undetectable by standard procedures. The ability of our classification to correctly identify blinded tumor samples suggests that the patterns we and others are observing may have real biological significance.
ACKNOWLEDGMENTS
We thank Mark McQuain, Melanie Robinson, Vicky Amann, and Dr. William Grady at the Vanderbilt University Medical Center, and Drs. Adi F. Gazdar, Shinichi Toyooka, and Kiyomi O. Toyooka at University of Texas Southwestern Medical Center for helpful support and thoughtful suggestions.
FOOTNOTES
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Supported by Lung Cancer Special Program of Research Excellence P50CA90949, P50CA70907, Mathers Foundation, and the Robert A. and Helen C. Kleberg Foundation.
1 To whom requests for reprints should be addressed, at Division of Hematology and Oncology, Vanderbilt-Ingram Cancer Center, 685 Preston Research Building, Nashville, TN 37232-6838. Phone: (615) 936-3321; Fax: (615) 936-3322; E-mail: d.carbone{at}vanderbilt.edu
2 The abbreviations used are: NSCLC, non-small cell lung cancer; WFCCM, Weighted Flexible Compound Covariate Method; SAM, Significance Analysis of Microarrays; ACTN4, α-actinin-4.
3 Supplemental figures and tables are available on our website, http://array.mc.vanderbilt.edu/supplemental and vicc.org/biostatistics/yamagata.ccr./.
4 Internet address: http://array.mc.vanderbilt.edu/.
5 Internet address: http://www.ncbi.nlm.nih.gov/BLAST/.
Received 1/14/03; revised 6/29/03; accepted 7/3/03.
REFERENCES