Purpose: Gene expression profiling has been shown to be a valuable tool for prognostication and identification of cancer-associated genes in human malignancies. We aimed to identify potential prognostic marker(s) in non-small cell lung cancers using global gene expression profiles.
Experimental Design: Twenty-one previously untreated patients with non-small cell lung cancer were analyzed using the Affymetrix GeneChip high-density oligonucleotide array and comparative genomic hybridization. Identified candidate genes were validated in an independent cohort of 45 patients using quantitative real-time reverse transcription-PCR and Western blot analyses. Follow-up data for these patients was collected and used to assess outcome correlations.
Results: Hierarchical clustering analysis yielded three distinct subgroups based on gene expression profiling. Cluster I consisted of 4 patients with adenocarcinoma and 1 with squamous cell carcinoma; clusters II and III consisted of 6 and 10 patients with squamous cell carcinoma, respectively. Outcome analysis was performed on the cluster groups containing solely squamous cell carcinoma, revealing significant differences in disease-specific survival rates. Moreover, patients who had advanced Tumor-Node-Metastasis stage and were assigned to the poor prognosis cluster group (cluster II) had significantly poorer outcomes. Comparative genomic hybridization analysis showed recurrent chromosomal losses at 1p, 3p, 17, 19, and 22 and gains/amplifications at 3q, 5p, and 8q, which did not vary significantly between the cluster groups. A subset of 11 cluster II (poor prognosis)-specific genes with corresponding chromosomal aberrations identified by comparative genomic hybridization was validated internally and then externally as prognostic markers in an independent cohort of patients with lung squamous cell carcinoma, identifying CSNK2A1 and C1-Inh as independent predictors of outcome.
Conclusion: CSNK2A1 and C1-Inh are independent predictors of survival in lung squamous cell carcinoma patients and may be useful as prognostic markers.
Lung cancer and other tobacco-related diseases pose a major worldwide health problem. In the United States, lung cancer is the leading cause of cancer death among both men and women. In 2001, there were ∼169,500 new cases of lung cancer diagnosed, and >157,400 people succumbed to this disease (1) . Based on cytomorphological features, lung cancer is classified into four main histologic variants including squamous cell carcinoma, adenocarcinoma, large cell carcinoma, and small cell lung carcinoma. Squamous cell carcinoma, adenocarcinomas, and large cell carcinoma account for 30%, 30% and 10% of all lung cancers, respectively; they are grouped as non-small cell lung cancer. Despite advances in multidisciplinary therapy, patients with non-small cell lung cancer have poor overall survival, with more than half dying from metastatic disease. The identification of molecular markers predicting poor outcome may facilitate current treatment selection and aid in the development of novel therapies.
High-throughput technologies, such as DNA microarrays and comparative genomic hybridization, represent a powerful means to comprehensively profile and monitor chromosomal aberrations and gene expression in cancer (2). These technologies not only enhance our understanding of the molecular pathogenesis of disease but also improve the classification of individual cancers. Expression profiles obtained by cDNA microarray analysis have been shown to be an important prognostic marker in adenocarcinomas of the lung (3). However, gene expression profiling typically yields voluminous data, making it difficult to identify individual gene targets having an impact on clinical behavior. Several reports suggest that gene expression changes occurring as a consequence of genomic aberrations are more likely to have biological and clinical relevance (4). Because even the largest chromosomal aberrations detected by comparative genomic hybridization typically affect the expression of a small number of genes, correlation of chromosomal and expression changes may be a rapid method for identifying candidate genes. Accordingly, the simultaneous application of global genomic screening using comparative genomic hybridization and high-throughput gene expression profiling has the potential to identify genes that are deregulated as a result of primary genetic events among the large pool of secondary downstream changes in gene expression (4, 5). Clearly, any genes identified by this approach require independent validation. Internal validation is performed using an independent analytical technique (e.g., real-time PCR) to confirm the array findings. External validation is performed on an independent set of cases to validate associations with outcome. Genes remaining candidates after internal and external validation merit additional consideration as prognostic markers.
In the present study, we show that analysis of gene expression patterns in individual tumors by microarray allows segregation of squamous cell carcinoma into subgroups with significant differences in clinical behavior. We identified subsets of genes whose differential expression was associated with copy number changes identified by comparative genomic hybridization analysis and validated casein kinase II α subunit (CSNK2A1) and C1-inhibitor (C1-Inh) as independent prognostic markers in these tumors.
MATERIALS AND METHODS
Fresh tissue samples were obtained with informed consent from patients undergoing surgical resection for non-small cell lung cancer at Memorial Sloan-Kettering Cancer Center, from February 1990 through July 1997, following guidelines established by the Institutional Review Board. All of the patients were previously untreated and had no detectable metastases to distant organs at presentation. Tumors were staged according to the American Joint Committee on Cancer/Unio Internationale Contra Cancrum Tumor-Node-Metastasis classification 5th edition (6). Adjuvant treatment was given after radical surgery in appropriate cases following the Center’s protocol. In each case, a portion of tumor was resected near the advancing edge of the tumor, avoiding its necrotic center. After excision, the tissues were immediately snap-frozen and stored in liquid nitrogen until use. The adjacent tissues were submitted for histopathological study to confirm that cancer cells made up ≥80% of the procured sample. Histologically normal lung parenchyma, resected at least 5 cm distant from the tumor area, was obtained and used as the baseline control.
DNA Preparation and Comparative Genomic Hybridization Analysis.
Genomic DNA from each sample was extracted using a DNeasy Tissue Kit (Qiagen, Valencia, CA) following the manufacturer’s protocols. Comparative genomic hybridization and image analysis were performed as described previously (7).
RNA Preparation and Oligonucleotide Microarray Analysis.
Total RNA was extracted from snap-frozen tissue samples with TRIzol reagent (Gibco/BRL, Life Technologies, Inc., Grand Island, NY) following the manufacturer’s protocol and repurified by the RNeasy Mini-spin column (Qiagen). Five to 10 μg of total RNA were reverse transcribed with a cDNA synthesis kit (Life Technologies, Inc., Rockville, MD) in the presence of an oligo dT-T7 primer. After phenol-chloroform extraction and ethanol precipitation, the cDNA pellet was air dried and resuspended in 12 μL of RNase-free water. Ten μL were used for the in vitro transcription-amplification reaction in the presence of biotinylated nucleotides (Enzo Diagnostics, Inc., Farmingdale, NY). Fifteen μg of labeled cRNA were fragmented by incubation at 95°C for 35 min in fragmentation buffer [40 mmol/L Tris-acetate (pH 8.1), 100 mmol/L KOAc, and 30 mmol/L MgOAc], and the fragmented cRNA was then hybridized against the Affymetrix HG U95Av2 oligonucleotide arrays (Affymetrix, Santa Clara, CA). The arrays were scanned using a Hewlett Packard confocal laser scanner and analyzed using MicroArray Suite 5.0 (Affymetrix).
Real-time PCR analysis was carried out as described previously (8). Highly purified salt-free gene-specific primers were designed using the Primer3 program and were purchased from Operon Technologies (Alameda, CA). Primers were verified for dimerization using Qiagen Operon Toolkit and then confirmed using the National Center for Biotechnology Information BLAST software. Reference sequences were obtained from LocusLink. Sequences of PCR primer sets (in 5′-3′ direction) were as follows: human homolog of the sea urchin fascin (HSN) FW: TTTCACCCTAGCCTGACTGG, HSN RV: GGACGCCTCCAGCAATAATA, Ki-67 FW: CCAGCAGCAAATCTCAGACA, Ki-67 RV: GCAGGTTGCCACTCTTTCTC, CSNK2A1 FW: GAACGCTTTGTCCACAGTGA, CSNK2A1 RV: TATCGCAGCAGTTTGTCCAG, γ-glutamylcysteine synthetase (GCS) FW: ACCATCATCAATGGGAAGGA, GCS RV: TTGGGATCAGTCCAGGAAAC, BTAK FW: GCAGATTTTGGGTGGTCAGT, BTAK RV: ATTTCAGGGGGCAGGTAGTC, neuroleukin (NLK) FW: AGGATGTGATGCCAGAGGTC, NLK RV: TCTTGCCTGTGTACCCCTTC, TTK FW: CGGTTCACTTGGGCATTTAC, TTK RV: CATCTTGTGGTGGCATGTTC, Cyclin E2 FW: TACTGACTGCTGCTGCCTTG, Cyclin E2 RV: ACTGTCCCACTCCAAACCTG, angiogenin (ANG) FW: ACACTTCCTGACCCAGCACT, ANG RV: CCGTCTCCTCATGATGCTTT, receptor-like protein-tyrosine phosphatase μ (PTPRM) FW: GGTGAACATGGTGCAAACAG, PTPRM RV: AGGTGTCCCCACAAAGACAG, C1-Inh FW: CAGCCCTTCTGTTTTCAAGG, C1-Inh RV: GATGCGGGGTAGTGTTAGGA. Sequences of PCR primer sets for 18S rRNA were described previously (9). Sequences of PCR primer sets for genomic DNA (in 5′-3′ direction) were as follows: CSNK2A1 FW: AGAGGAGGTCCCAACATCATCA, CSNK2A1 RV: CGTCTGGTACAATTGCTTGAA, β-actin FW: AAGATGACCCAGGTGAGTGG, β-actin RV: AACGGCAGAAGAGAGAACCA.
Real-time PCR was performed on a Bio-Rad iCycler with iQ detection system using integrated software (version 3.0a). PCR was undertaken using 2 μg of cDNA or DNA in a 20 μL reaction including 10 μL 2× Bio-Rad iQ SYBR green supermix (Bio-Rad, Hercules, CA) and 1 μL of each 5 μM primer, per reaction. Preliminary experiments were performed with each primer pair and serial dilutions (80, 8, 0.8, 0.08, or 0.008 ng) of reverse-transcribed total RNAs from a head and neck carcinoma cell line MDA1186 or a pooled normal lung tissue sample (MixN) to determine the annealing temperature that yielded the greatest amount of specific product with melting temperature and to calculate the real-time PCR efficiency (E) according to the equation: E = 10^(-1/slope) (10). The relative quantification of a target gene in comparison to a reference (18S rRNA) was performed as described (9, 10). The 18S rRNA was used as a reference for input RNA, because it is considered a stable housekeeping gene and was detected at the same level in both tumor and normal tissues by GeneChip array and real-time reverse transcription-PCR (RT-PCR). Reverse transcribed total RNAs (80 ng) from MDA1186 cells (for up-regulated genes) or pooled normal lung tissue (for down-regulated genes) were included in each experiment as a control standard for interexperimental variation, and the amount of mRNA expression in each sample was quantified relative to a control standard. A negative control without cDNA template was run with every assay to assess the overall specificity. Unless otherwise stated, each assay included duplicate reactions for each sample and was repeated twice.
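The efficiency calculation described above can be illustrated with a minimal sketch. It fits the slope of threshold cycle (Ct) against log10 of template amount for the stated dilution series and applies E = 10^(-1/slope). The Ct values are hypothetical, and the efficiency-corrected ratio in `relative_ratio` is an assumption about how the cited relative quantification works, not a transcription of the authors' procedure.

```python
import math

def fit_slope(log10_amounts, ct_values):
    """Least-squares slope of Ct versus log10(template amount)."""
    n = len(ct_values)
    mean_x = sum(log10_amounts) / n
    mean_y = sum(ct_values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log10_amounts, ct_values))
    sxx = sum((x - mean_x) ** 2 for x in log10_amounts)
    return sxy / sxx

def efficiency(slope):
    """Real-time PCR efficiency, E = 10^(-1/slope)."""
    return 10 ** (-1.0 / slope)

def relative_ratio(e_target, e_ref, ct_target_control, ct_target_sample,
                   ct_ref_control, ct_ref_sample):
    """Efficiency-corrected target/reference ratio (assumed form, see lead-in)."""
    return (e_target ** (ct_target_control - ct_target_sample)
            / e_ref ** (ct_ref_control - ct_ref_sample))

# Dilution series from the text (ng of reverse-transcribed total RNA).
amounts_ng = [80, 8, 0.8, 0.08, 0.008]
# Hypothetical Ct values: perfect doubling adds log2(10) cycles per 10-fold dilution.
ct_values = [15.0 + math.log2(10) * i for i in range(len(amounts_ng))]

slope = fit_slope([math.log10(a) for a in amounts_ng], ct_values)
e = efficiency(slope)

# Hypothetical Ct values for one tumor sample versus the control standard,
# with 18S rRNA as the reference gene.
ratio = relative_ratio(e, e, ct_target_control=25.0, ct_target_sample=22.0,
                       ct_ref_control=12.0, ct_ref_sample=12.0)
```

With perfect doubling the fitted slope is about -3.32 and E is 2, so a target that crosses threshold three cycles earlier than in the control, at equal reference Ct, corresponds to an 8-fold increase.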
Western Blot Analysis.
Protein extraction and immunoblotting were performed essentially as described (11). Briefly, 50 μg of protein was lysed in 1× protein lysis buffer (Cell Signaling, Beverly, MA). Samples were run on a 4–20% polyacrylamide gel in 1× tris-glycine-SDS running buffer (Bio-Rad, Hercules, CA) and then transferred to a polyvinylidene difluoride membrane in tris-glycine buffer (Bio-Rad). Blocking was undertaken in 5% nonfat milk in PBS-Tween. Primary antibody incubation was performed at 4°C overnight in blocking solution, as above. Secondary antibody incubation was performed for 1.5 hours at room temperature. Antibodies used were as follows: casein kinase 2 (CK2) KAP-ST010 (Stressgen Biotechnologies, San Diego, CA), β-actin sc-1616 (Santa Cruz Biotechnologies, Santa Cruz, CA), horseradish peroxidase-conjugated antirabbit antibody sc-2054 (Santa Cruz Biotechnologies), and horseradish peroxidase-conjugated antigoat antibody sc-2056 (Santa Cruz Biotechnologies). Detection was performed using ECL Plus Western Blotting Detection System (Amersham Biosciences, Piscataway, NJ).
All of the correlation and outcome analysis was performed using the JMP statistical software package version 4.0.0 (SAS Institute Inc., Cary, NC). Results were expressed as mean ± SD. For comparisons of the levels of mRNA expression between groups, the two-tailed Mann-Whitney (rank sum) test and the Kruskal-Wallis with Dunn’s multiple comparison test were used when comparing two groups and three groups, respectively. Correlations between the mRNA levels were computed using the two-tailed Spearman nonparametric correlation. The Pearson χ2 test with Fisher’s exact test was used to assess the association of gene expression and clinicopathological parameters. Survival was measured in months from the date of surgery to the date of relapse/death or to the last follow-up. Survival curves and median survival times were calculated by the Kaplan-Meier method, and differences in survivor distribution were calculated by the log-rank test. Cox’s proportional hazards modeling was performed to identify factors with a significant influence on survival while controlling for confounding variables. A P-value < 0.05 was considered statistically significant, and all of the tests were two-tailed. Risk ratios with 95% confidence intervals were estimated for each covariate.
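As an illustration of the survival methods named above, the following is a minimal sketch of a Kaplan-Meier estimator and a two-group log-rank test. It is not the JMP implementation used in the study. The function names and all follow-up times are hypothetical, and the P-value uses the chi-square approximation with 1 degree of freedom.

```python
import math

def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct event time.
    events: 1 = relapse/death observed, 0 = censored at last follow-up."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve

def median_survival(curve):
    """First time at which estimated survival drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached

def logrank(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square, P-value)."""
    pooled = ([(t, e, 0) for t, e in zip(times_a, events_a)]
              + [(t, e, 1) for t, e in zip(times_b, events_b)])
    observed_minus_expected = 0.0
    variance = 0.0
    for t in sorted({t for t, e, _ in pooled if e}):
        n_a = sum(1 for tt, _, g in pooled if tt >= t and g == 0)
        n_b = sum(1 for tt, _, g in pooled if tt >= t and g == 1)
        d_a = sum(1 for tt, e, g in pooled if tt == t and e and g == 0)
        d = sum(1 for tt, e, _ in pooled if tt == t and e)
        n = n_a + n_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi2 = observed_minus_expected ** 2 / variance
    return chi2, math.erfc(math.sqrt(chi2 / 2))

# Hypothetical follow-up times in months.
curve = kaplan_meier([5, 8, 12, 12, 20], [1, 0, 1, 1, 0])
median = median_survival(curve)

# Hypothetical poor-prognosis and good-prognosis groups, all events observed.
chi2_sep, p_sep = logrank([2, 3, 4, 5], [1, 1, 1, 1], [10, 11, 12, 13], [1, 1, 1, 1])
# Identical groups should show no difference.
chi2_same, p_same = logrank([2, 4, 6], [1, 1, 1], [2, 4, 6], [1, 1, 1])
```

Censored patients leave the risk set without lowering the survival estimate, which is why the patient censored at 8 months changes only the denominator at 12 months.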
RESULTS AND DISCUSSION
Chromosomal Abnormalities in Non-Small Cell Lung Cancer.
Comparative genomic hybridization was performed in all 21 of the cases in which gene expression profiling was performed. Copy number changes in these cases are detailed in Fig. 1. Recurrent losses most commonly involved chromosomes 1p (57.1%), 3p (57.1%), 5q (38.1%), 8p (33.3%), 9q (28.6%), 10 (28.6%), 13q (28.6%), 15q (33.3%), 16q (28.6%), 17 (42.9%), 19 (42.9%), and 22 (33.3%). Gains most commonly involved 3q (47.6%), 5p (66.7%), 8q (38.1%), and 19 (19.0%). High level amplifications were identified at 3q26–27, 5p, 7q21, 8q23, 11q12–13, 12p11, 12q13, 13q14, 16q22–24, 17q23, 18q11, 19q11, and 22q11. This pattern of chromosomal aberrations is similar to that reported for non-small cell lung cancer in the literature. A notable individual difference was gain of 3q, which occurred exclusively in the squamous cell carcinomas and not in adenocarcinomas. No correlations between comparative genomic hybridization findings and clinical outcome were identified. For the purposes of this study, we restricted the use of comparative genomic hybridization data to correlations with gene expression analysis.
Gene Expression Patterns in Non-Small Cell Lung Cancer.
We analyzed gene expression profiles in 21 patients with non-small cell lung cancer by comparison between primary tumor samples and a pool of morphologically normal lung tissues (from 10 cases). Among 12,625 probe sets in the Affymetrix array, there were 95 probe sets changed in a majority of the cases (20 of 21). There were 56 probe sets, representing 41 genes and 12 expressed sequence tags, that were significantly altered (≥3-fold) in all 21 of the cases. Six genes and 1 expressed sequence tag were increased, and 35 genes and 11 expressed sequence tags were decreased in all of the tumors, compared with normal controls. Table 1 lists the genes grouped into biological pathways known to be relevant in oncogenesis such as cell proliferation, apoptosis, development, and invasion/metastasis. Among 6 overexpressed genes, UBCH10 encoding cyclin-selective ubiquitin carrier protein or ubiquitin-conjugating enzyme showed the highest level of increase (average, 41.6-fold). UBCH10 plays a crucial role in cell-cycle regulation from mitosis into G1 phase. Other genes that are involved in cell cycle control and intracellular transport events are mitotic centromere-associated kinesin (MCAK) and HMGIY. MCAK (also called kinesin-like 6) is highly expressed in tissues containing dividing cells and is not detected in normal lung tissue (12), whereas HMGIY encodes high mobility group protein isoforms I and Y and mediates a network of protein-DNA and protein-protein interactions. HMGIY has been shown to be a potential oncogene that works in conjunction with c-Myc (13). MMP9 is an endopeptidase that digests basement membrane type IV collagen, and enhanced MMP9 expression has been related to malignant progression of several tumor types including non-small cell lung cancer (14). Osteopontin (OPN) is a phosphorylated glycoprotein involved in cancer development and progression, possibly via Ras activation. Recently, OPN was shown to be highly expressed in non-small cell lung cancer tissues and cell lines but not in small cell lung cancer (15).
Several genes were found to be significantly underexpressed or absent in tumor compared with normal lung tissues (Table 1). These may include tumor suppressor genes whose expression was reduced by chromosomal loss during tumorigenesis. RAGE had an average reduction of 64.4-fold compared with normal controls, making it the gene with the highest degree of suppression. Moreover, recurrent loss of the 14q32 locus was identified by comparative genomic hybridization analysis (Fig. 1) in tumors with loss of gene expression. RAGE encodes a receptor for advanced glycosylation end products and is a multiligand member of the immunoglobulin superfamily of cell surface molecules implicated in homeostasis and cell differentiation. Absent or markedly reduced expression of RAGE, both at transcriptional and protein levels, was demonstrated in non-small cell lung cancer tissue, suggesting that down-regulation of the receptor (as observed in the present study) may be a critical step in lung tumor formation (16).
Promyelocytic leukemia zinc finger (PLZF) is a DNA-binding transcriptional repressor whose disruption results in leukemogenesis (17). PLZF plays a key role in cell proliferation and functions as a proapoptotic factor. The PLZF underexpression found here in lung tumors may lend a selective growth and survival advantage to the tumor cells. Another apoptotic regulator found to be down-regulated with recurrent loss at its chromosomal locus (19p13.3) in this study is GADD45β, encoding growth arrest and DNA damage-inducible protein (18, 19). PLA2G2A encoding the secretory type II phospholipase A2 is a potential tumor suppressor gene in human colorectal tumorigenesis (20). The PLA2G2A gene, representing the Mom1 (Modifier of Min-1) locus, is located on chromosome 1p35–36.1 (21), which is frequently lost based on our comparative genomic hybridization data. Loss of heterozygosity and absent or very low expression levels of PLA2G2A are common features of colorectal cancer cell lines, whereas normal colonic mucosa usually shows expression (22). The present finding of PLA2G2A underexpression in lung tumor tissues is novel and suggests a role for PLA2G2A in this cancer type. Another common region of chromosomal loss based on our comparative genomic hybridization data is 3p, where two underexpressed genes, tetranectin (TNA; 3p22-p21.3) and α2δ calcium channel subunit I (3p21), are located. Tetranectin is a plasminogen-binding protein that is induced during the process of osteogenesis (23). The plasma concentration of tetranectin is reduced in patients with various malignancies, and its role in cancer progression has been suggested (24). The α2δ calcium channel subunit I gene has also been suggested as a putative tumor suppressor gene in lung cancer (25).
The detection of cellular genes consistently altered in 21 different cases of human non-small cell lung cancer will lead to additional experiments to determine their usefulness as classifiers to predict the malignant nature of lung tissues. The biology associated with these genes could be explored to evaluate their role in non-small cell lung cancer development.
Hierarchical Clustering Analyses.
We grouped patients according to the pattern of gene expression based on hierarchical clustering. The raw microarray image data were processed using the Affymetrix GeneChip software to compute the relative expression levels for each sample based on a pool of normal tissues. The pool of normal samples was used to compute a normal to tumor fold-change, in effect normalizing the tumor samples by the average of the normal pool. Using the normal samples in this way “drops” them from the subsequent analyses. The fold changes were first filtered to remove any gene that did not have at least one sample with a fold change >2. The remaining genes were then used to cluster the samples. We used the log of the fold change and normalized each sample vector to unit length. The log of the fold change was then clustered using hierarchical clustering with the Ward linkage method and the standard Euclidean distance metric (26). The tumor sample dendrogram, summarizing the degree of similarity in gene expression among the 21 samples, is shown in Fig. 2. Lung tumor samples can be subclassified into three distinct groups. Cluster I consisted of 4 adenocarcinomas and 1 squamous cell carcinoma, whereas clusters II and III consisted of 6 and 10 patients with squamous cell carcinoma, respectively (Table 2). It is interesting to note that all 4 of the adenocarcinoma samples were grouped together in cluster I, indicating that these tumors share a common expression pattern. Significant positive correlation was observed between cluster I and adenocarcinoma cell type (P < 0.0004; r² = 0.7553). The reason why 1 squamous cell carcinoma sample was clustered into that group is not clear. The clinicopathological data for each of the cluster groupings defined by gene expression profiling are given in Table 2. Except for histopathological category, none of the other parameters examined showed significant differences between the cluster groups, indicating that gene expression profiling can segregate tumor samples independently from other clinicopathological parameters. Analysis of comparative genomic hybridization data showed that there were minimal differences in copy number changes between cases in cluster groups identified by gene expression profiling (Fig. 1).
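The clustering steps described in this section (fold-change filter, log transform, unit-length normalization of each sample vector, Ward linkage on Euclidean distance) can be sketched in a few lines. This is an illustrative pure-Python sketch, not the software used in the study. The toy fold-change matrix is hypothetical, and two details are assumptions: fold changes are treated as tumor/normal ratios so that the filter applies in both directions, and log base 2 is used (the base does not affect the result after unit-length normalization).

```python
import math

def preprocess(fold, min_fold=2.0):
    """fold[g][s] is the tumor/normal ratio of gene g in sample s.
    Keeps genes with a >min_fold change (either direction) in at least one
    sample, then returns one unit-length log2 vector per sample."""
    kept = [row for row in fold if any(max(r, 1.0 / r) > min_fold for r in row)]
    vectors = []
    for s in range(len(fold[0])):
        v = [math.log2(row[s]) for row in kept]
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        vectors.append([x / norm for x in v])
    return vectors

def ward_clusters(vectors, k):
    """Agglomerative clustering: repeatedly merge the pair of clusters whose
    merge gives the smallest increase in within-cluster sum of squares (Ward)."""
    dims = len(vectors[0])
    clusters = [[i] for i in range(len(vectors))]

    def centroid(members):
        return [sum(vectors[i][d] for i in members) / len(members) for d in range(dims)]

    def merge_cost(a, b):
        ca, cb = centroid(a), centroid(b)
        squared_distance = sum((x - y) ** 2 for x, y in zip(ca, cb))
        return len(a) * len(b) / (len(a) + len(b)) * squared_distance

    while len(clusters) > k:
        pairs = ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda p: merge_cost(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters

# Hypothetical fold changes: 4 genes (rows) by 6 tumor samples (columns).
fold = [
    [8.0, 6.0, 1.1, 0.9, 1.0, 1.2],   # up in samples 0 and 1
    [1.0, 1.2, 7.0, 9.0, 0.9, 1.1],   # up in samples 2 and 3
    [1.1, 0.9, 1.0, 1.2, 0.1, 0.15],  # down in samples 4 and 5
    [1.1, 1.0, 0.9, 1.2, 1.0, 1.1],   # never changes >2-fold, so it is filtered out
]
vectors = preprocess(fold)
groups = sorted(sorted(c) for c in ward_clusters(vectors, k=3))
```

Cutting the merge sequence at three clusters mirrors the three groups read off the dendrogram. With real data the cut level is a judgment call rather than a fixed parameter.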
Survival of Patients with Different Squamous Cell Carcinoma Subgroups.
The subdivision of squamous cell carcinoma based on gene expression patterns raised the possibility that clinical outcomes may be different for the two squamous cell carcinoma subgroups (clusters II and III), as had been shown in diffuse large B-cell lymphoma (27) and lung adenocarcinoma (28). The Kaplan-Meier analysis with log-rank test was performed on the two cluster groups, revealing an independent impact of cluster II on disease-specific survival (P = 0.0423) (Fig. 3). Moreover, patients having a combination of advanced Tumor-Node-Metastasis stage (II-IV) and assigned to the poor prognosis cluster group based on gene expression profiling had significantly poorer outcomes (median relapse-free survival = 6.9 versus 55.5 months, P < 0.0001; median disease-specific survival = 7 versus 47 months, P < 0.0001; and median overall survival = 7 versus 38.4 months, P < 0.0001). This implies that gene expression profiles can be used to subdivide lung squamous cell carcinoma into subgroups that are clinically different on the basis of survival.
Expression of Individual Genes Correlated with Poor Prognosis Cluster.
The gene expression profile for cluster II was of particular interest, because these tumors showed poorer prognosis, suggesting that analysis of individual gene differences may provide insight into aggressive tumor behavior. We first compiled a list of genes that were differentially expressed and best distinguished the three groups identified by gene expression profiling. The following criteria were used: increased or decreased expression relative to normal lung tissues and minimal variation in expression within each cluster. Fig. 4 lists a subset of genes that were strongly overexpressed (Fig. 4A) or underexpressed (Fig. 4B) in cluster II but not in clusters I or III. These include genes associated with metastasis/invasion, transcription factors, oncogenes/tumor suppressor genes, and differentiation markers.
Given that few genes show changes in their expression in regions of chromosomal aberration identified by comparative genomic hybridization, we focused our analysis on deregulated genes within chromosomal loci having copy number aberrations identified by comparative genomic hybridization analysis. To increase our identification of “relevant” genes, we additionally restricted our analyses to genes associated with cancer pathogenesis based on review of available information. We selected 8 “up-regulated” and 3 “down-regulated” genes, located in chromosomal loci involved in recurrent chromosomal gains and losses identified by comparative genomic hybridization, for additional analysis using quantitative real-time RT-PCR. These included genes encoding actin bundling protein or HSN (29) , CSNK2A1 (30) , GCS (31) , serine/threonine kinase BTAK (32) , Ki-67 (33) , human kinase TTK (34) , cyclin E2 (35 , 36) , NLK (37) , ANG (38) , PTPRM (39) , and C1-Inh (40) .
All 8 of the overexpressed genes have been suggested to play a role in malignant progression. CSNK2A1 encodes the catalytic subunit α of protein kinase CK2 involved in cell cycle progression. CSNK2A1 cooperates with Ha-ras by increasing the growth rate of transformed fibroblasts (41) , and elevated CK2 activity is associated with malignant transformation of several tissue types including lung (42) . CK2 overexpression may contribute to tumorigenesis via regulation of the Wnt/β-catenin pathway with c-myc as a downstream target (43) . GCS is the rate-limiting enzyme in glutathione synthesis. It plays a crucial role in homeostasis of normal cells and in the growth and resistance of malignant cells to chemotherapy (44) . Increased expression of γ-GCS in non-small cell lung cancer, predominantly squamous cell carcinoma cell type with the strong association with Bcl-2, an antiapoptotic protein, has been reported recently (45) . Cyclin E2 and cyclin E1 (formerly called cyclin E) belong to the family of E-type cyclins that control the G1-S-phase progression in mammalian cells. Cyclin E1 has been shown to be an independent prognostic marker in non-small cell lung cancer (46) . Unlike cyclin E1, which is expressed in most proliferating normal and tumor cells, cyclin E2 levels were very low in nontransformed cells and increased significantly in tumor cells (46) . Increased cyclin E2 expression has been reported in non-small cell lung cancer and small cell lung cancer cell lines, but its role in lung cancer in vivo is not yet known. Ki67 is a cell cycle-associated nuclear nonhistone protein associated with cell proliferation. Immunohistochemical studies in non-small cell lung cancer revealed that high Ki-67 expression was significantly associated with squamous cell carcinoma histology and that high Ki-67 was an unfavorable independent prognostic factor (47) . 
NLK, also called phosphohexose isomerase, is a cytokine that regulates cell motility in vitro as well as invasion and metastasis in vivo (48 , 49) . The autocrine motility factor receptor has been demonstrated to be an independent prognosticator in various types of tumors including non-small cell lung cancer (49) . Elevated expression of HSN contributes to aggressive cell behaviors by modulating actin structures and motility in transformed cells and c-erbB-2 overexpressing breast cancer cell lines (50 , 51) . BTAK (also known as STK15 or aurora2) encoding a centrosome-associated kinase, is amplified and overexpressed in multiple tumor cell types and is involved in the induction of centrosome duplication-distribution abnormalities and aneuploidy (52) . TTK is a protein kinase that plays a role in cell cycle control (53) . TTK is detectable in most malignant tissues and normal tissues, which contain a large number of proliferating cells, but is not detected in most benign tissues (53) .
Among the down-regulated genes, ANG was the first tumor-derived angiogenic factor identified. However, subsequent studies revealed that ANG was expressed in both normal epithelial cells and neoplastic cells (54), and the lowest expression was observed in early development. ANG, also called RNASE5, has been shown to play a role in RNA catabolism and turnover. Interestingly, RNase 4, from the same family of ribonucleases as ANG, is encoded by RNASE4 on chromosome 14, which shows a loss in 20% of cases based on comparative genomic hybridization analysis. Both ANG and RNASE4 are in the down-regulated cluster II gene lists (Fig. 4B). Recent studies in breast (54) and head and neck squamous cell carcinoma (55) showed that low levels of ANG were associated with poor outcome. The underlying mechanism of this underexpression in lung cancer is currently unknown. PTPRM encodes an immunoglobulin superfamily transmembrane receptor protein-tyrosine phosphatase with a role in cell-cell recognition, cell-cell adhesion, and interaction with the cadherin-catenin complex (39). Another member of the receptor protein-tyrosine phosphatases, namely PTPRG, has been suggested recently as a candidate tumor suppressor gene in lung cancer (56), but the role of PTPRM is not yet known. C1-Inh, a member of the serpin superfamily of protease inhibitors, is an important regulator of inflammatory reactions and the intrinsic pathway of coagulation (57). Hereditary C1-Inh deficiency is associated with angioedema, and the plasma levels are decreased in fatal sepsis, severe burn, and capillary leakage syndrome. The plasma C1-Inh-specific activity in cancer patients was found to be significantly lower than that in normal controls, although its role in malignancy is unclear (58).
We used a two-step quantitative RT-PCR using SYBR Green I dye detection with product verification by melting curve analysis to validate expression changes identified by gene array analysis for selected genes in all 21 of the cases. Real-time PCR evaluates product accumulation during the log-linear phase of the reaction and is currently the most accurate and reproducible approach to gene quantification (59 , 60) . We first determined whether the results obtained from microarray hybridizations agreed with the RT-PCR findings. Comparison of the fold changes from microarray and RT-PCR data revealed that the overall trend and correlation are similar (P = 0.0002–0.0239; Spearman r = 0.4335–0.6668, data not shown). The RT-PCR data confirmed the up-regulation and down-regulation of selected candidates in the microarray measurements. Genes with strong hybridization signals and at least 2-fold differences were more likely to be validated by real-time RT-PCR as reported previously (60) .
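The agreement between platforms reported above rests on Spearman rank correlation. A minimal sketch of that statistic follows, computed as the Pearson correlation of tie-averaged ranks. The fold-change values are hypothetical and serve only to show the calculation.

```python
import math

def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rho: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (sd_x * sd_y)

# Hypothetical fold changes for one gene across five tumors.
microarray_fold = [41.6, 3.2, 0.5, 12.0, 0.02]
rt_pcr_fold = [120.0, 0.6, 0.8, 30.0, 0.01]
rho = spearman(microarray_fold, rt_pcr_fold)
```

Because only ranks enter the calculation, the two platforms can report fold changes on very different scales and still correlate strongly, which suits a comparison between array signal and RT-PCR quantification.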
Outcome Analysis of Clinicopathological Factors and Cluster II Gene Expression.
The biological significance of the cluster II-specific genes was validated in an independent cohort of patients consisting of 45 lung cancers (Table 3). Using real-time semiquantitative RT-PCR, the mean expression levels of HSN, CSNK2A1, GCS, BTAK, Ki-67, TTK, cyclin E2, and NLK in lung tumors were found to be significantly greater (1333.5, 3.984, 123.6, 2.93, 2.178, 121.6, 4.1, and 2.83-fold; P = 0.0375, 0.032, 0.0069, 0.037, 0.048, 0.0001, 0.0055, and 0.05, respectively) than those in histologically normal lung tissues. In contrast, expression of ANG, PTPRM, and C1-Inh in lung tumors was significantly lower (91.4, 53.5, and 134.1-fold; P = 0.05, 0.0258, and 0.0042, respectively) than the expression in control tissues. We defined the cutoff value for differentiation between aberrant and normal expression as the mean ± 2 SD of the corresponding mRNA levels in normal lung tissues (HSN = 0.128 ± 1.28, CSNK2A1 = 2745.68 ± 437.119, GCS = 73.735 ± 26.416, BTAK = 183.456 ± 75.799, Ki-67 = 5839.684 ± 1788.099, TTK = 28.396 ± 51.646, cyclin E2 = 3392.771 ± 1503.282, NLK = 5180.875 ± 2870.617, ANG = 176041.567 ± 64168.484, PTPRM = 22565.207 ± 13104.734, and C1-Inh = 59765.058 ± 27079.621). Using these criteria, HSN, CSNK2A1, GCS, BTAK, Ki-67, TTK, cyclin E2, and NLK, respectively, were overexpressed in 48.5%, 21.2%, 75.7%, 42.4%, 21.2%, 66.7%, 54.5%, and 27.3% of lung squamous cell carcinomas, and ANG, PTPRM, and C1-Inh, respectively, were underexpressed in 84.8%, 66.7%, and 54.5% of lung squamous cell carcinomas.
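The cutoff rule described above (mean ± 2 SD of the levels in normal lung tissue) can be sketched as follows. The expression values are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say which estimator was used.

```python
import statistics

def expression_cutoffs(normal_levels):
    """Lower and upper cutoffs: mean - 2 SD and mean + 2 SD of normal-tissue levels."""
    mean = statistics.mean(normal_levels)
    sd = statistics.stdev(normal_levels)  # sample SD (assumption)
    return mean - 2 * sd, mean + 2 * sd

def classify(level, low, high):
    """Label one tumor measurement against the normal-tissue cutoffs."""
    if level > high:
        return "overexpressed"
    if level < low:
        return "underexpressed"
    return "normal"

# Hypothetical relative mRNA levels of one gene.
normal_levels = [90.0, 100.0, 110.0, 95.0, 105.0]
tumor_levels = [130.0, 70.0, 100.0, 140.0]

low, high = expression_cutoffs(normal_levels)
calls = [classify(level, low, high) for level in tumor_levels]
percent_over = 100.0 * calls.count("overexpressed") / len(calls)
```

Applying the same rule gene by gene yields the kind of per-gene percentages of over- and underexpressing tumors reported in the text.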
We assessed the prognostic significance of various clinicopathological parameters and of the expression of selected cluster II genes in 33 patients with lung squamous cell carcinoma using Kaplan-Meier plots. With a median follow-up for living patients of 93.4 months, the median relapse-free survival, disease-specific survival, and overall survival were 57, 75.5, and 38.4 months, respectively. Univariate analysis (log-rank test of Kaplan-Meier survival curves) demonstrated that overexpression of CSNK2A1 and underexpression of C1-Inh were associated with unfavorable prognosis (Table 4). Only factors with P < 0.2 in univariate analysis (age, gender, BTAK, CSNK2A1, angiogenin, and C1-Inh) were evaluated by multivariate analysis using Cox's proportional hazards model. Both CSNK2A1 and C1-Inh remained independent prognosticators in this group of patients (Table 5). The other cluster II genes analyzed did not have a significant association with outcome; however, this may be due to the relatively small cohort of patients examined in the present study.
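The univariate step described above (Kaplan-Meier curves compared by a log-rank test) can be sketched in plain Python. This is an illustrative sketch, not the software used in the study: the function names and the follow-up data are hypothetical, and the multivariate Cox model is not covered.

```python
from math import erfc, sqrt


def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct event time.

    events[i] is True for an observed event and False for a censored patient.
    """
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= leaving  # events and censored patients both leave the risk set
    return curve


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square, P) with 1 degree of freedom."""
    pooled = zip(times_a + times_b, events_a + events_b)
    event_times = sorted({t for t, e in pooled if e})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for ti in times_a if ti >= t)  # group A patients still at risk
        n_b = sum(1 for ti in times_b if ti >= t)
        d_a = sum(1 for ti, e in zip(times_a, events_a) if ti == t and e)
        d_b = sum(1 for ti, e in zip(times_b, events_b) if ti == t and e)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return chi2, erfc(sqrt(chi2 / 2))


# Hypothetical follow-up in months (illustration only, not study data).
high_t, high_e = [6, 10, 14, 20, 30], [True, True, True, True, False]
low_t, low_e = [25, 40, 55, 60, 72], [True, False, True, False, False]
chi2, p = logrank_test(high_t, high_e, low_t, low_e)
print(kaplan_meier(high_t, high_e))
print(f"log-rank chi-square = {chi2:.2f}, P = {p:.4f}")
```

In the selection scheme described above, factors reaching P < 0.2 in a univariate test of this kind would then be carried forward into the multivariate model.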
Finally, the correlation among DNA amplification, mRNA level, and protein expression of CK2 in carcinoma cells was confirmed in representative non-small cell lung cancer cell lines and tissues. Five non-small cell lung cancer cell lines (H157, H322, A549, H1299, and H2030) and four patient squamous cell carcinoma tumor specimens (L9, L22, L4, and L20) were chosen for analysis. Genomic DNA amplification of CSNK2A1 was determined by real-time PCR and normalized to β-actin DNA copy number. The same cell lines and tumor samples were used for protein analysis by Western blotting. Because the cell lines are pure populations of tumor cells, gene expression in these lines cannot come from stromal tissue and therefore reflects changes specific to the tumor. The levels of protein expression in primary tissues corresponded with amplification status by comparative genomic hybridization and high-level mRNA expression by real-time PCR (Fig. 5).
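One common way to normalize a real-time PCR copy-number measurement to a reference locus is the comparative threshold cycle (ΔΔCt) calculation, sketched below. The text does not state which calculation was used, so this is an assumption: the function name, the calibrator sample, the assumed amplification efficiency of 2, and the Ct values are all hypothetical.

```python
def relative_copy_number(ct_target, ct_reference,
                         ct_target_cal, ct_reference_cal,
                         efficiency=2.0):
    """Relative copy number of a target locus by the comparative Ct method.

    ct_target / ct_reference: threshold cycles in the test sample for the
    target locus (e.g. CSNK2A1) and the reference locus (e.g. beta-actin).
    ct_target_cal / ct_reference_cal: the same measurements in a calibrator
    sample, such as normal genomic DNA (assumption).
    efficiency: assumed per-cycle amplification factor (2.0 means doubling).
    """
    delta_sample = ct_target - ct_reference
    delta_calibrator = ct_target_cal - ct_reference_cal
    return efficiency ** -(delta_sample - delta_calibrator)


# Hypothetical Ct values (illustration only, not study data).
ratio = relative_copy_number(ct_target=24.0, ct_reference=25.0,
                             ct_target_cal=26.0, ct_reference_cal=25.0)
print(f"relative copy number = {ratio:.1f}")
```

A ratio near 1 would indicate the same target-to-reference balance as the calibrator, and values well above 1 would be consistent with amplification of the target locus.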
In summary, this is the first study to use large-scale transcriptional profiling and screening for genomic aberrations to predict survival outcome and to identify gene targets in squamous cell carcinoma of the lung. The use of high-density oligonucleotide probe arrays to identify gene expression differences between non-small cell lung cancer and normal lung tissues provides a powerful means to decode the molecular events involved in the genesis and progression of non-small cell lung cancer. We identified distinct subtypes of non-small cell lung cancer with significant differences in outcome based on differences in gene expression patterns in morphologically indistinguishable squamous cell carcinoma, as reported in other tumor types (28, 61). Moreover, the combination of comparative genomic hybridization and gene expression profiling is useful for identifying individual genes associated with aggressive behavior, including CSNK2A1 and C1-Inh, which had independent prognostic significance in non-small cell lung cancer. Although these initial findings will need to be validated in relation to clinical parameters and outcome in larger patient cohorts, the characterization of genes identified as significant prognostic predictors by oligonucleotide microarray analysis may provide novel targets for prognostication and treatment of non-small cell lung cancer.
We thank Nancy Bennett for her excellent editorial assistance and Liliana Villafania for technical support.
Grant support: Martel Foundation (B. Singh and V. Rusch). N. Socci is supported by the National Institute of Neurological Disorders and Stroke NIH Grant (NS39662). P. O-charoenrat is a recipient of the American Cancer Society-Unio Internationale Contra Cancrum International Fellowship for Beginning Investigators.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Requests for reprints: Bhuvanesh Singh, Head and Neck Service, Memorial Sloan-Kettering Cancer Center, 1275 York Avenue, New York, NY 10021.
6 Internet address: http://www.genome.wi.mit.edu/cgi-bin/primer/primer3.cgi.
7 Internet address: http://oligos.qiagen.com/oligos/toolkit.php.
8 Internet address: http://www.ncbi.nlm.nih.gov/BLAST/.
9 Internet address: http://www.ncbi.nlm.nih.gov/LocusLink/.
- Received September 29, 2003.
- Revision received May 13, 2004.
- Accepted May 19, 2004.