The majority of lung cancers are caused by long term exposure to the several classes of carcinogens present in tobacco smoke. Although a significant fraction of lung cancers in never smokers may also be attributable to tobacco, many such cancers arise in the absence of detectable tobacco exposure, and may follow a very different cellular and molecular pathway of malignant transformation. Recent studies summarized here suggest that lung cancers arising in never smokers have a distinct natural history, profile of oncogenic mutations, and response to targeted therapy. The majority of molecular analyses of lung cancer have focused on genetic profiling of pathways responsible for metabolism of primary tobacco carcinogens. Limited research has been conducted evaluating familial aggregation and genetic linkage of lung cancer, particularly among never smokers in whom such associations might be expected to be strongest. Data emerging over the past several years show that lung cancers in never smokers are much more likely to carry activating mutations of the epidermal growth factor receptor (EGFR), a key oncogenic factor and direct therapeutic target of several newer anticancer drugs. EGFR mutant lung cancers may represent a distinct class of lung cancers, enriched in the never-smoking population, and less clearly linked to direct tobacco carcinogenesis. These insights followed initial testing and demonstration of efficacy of EGFR-targeted drugs. Focused analysis of molecular carcinogenesis in lung cancers in never smokers is needed, and may provide additional biologic insight with therapeutic implications for lung cancers in both ever smokers and never smokers. (Clin Cancer Res 2009;15(18):5646–61)
Natural History and Prognosis
The preceding articles in this issue of CCR Focus present an overview, and a description of clinical epidemiology and risk factors associated with lung cancer in never smokers (1, 2). This article is intended to summarize the current status of molecular profiling of lung cancer in never smokers, to indicate how profiles differ between lung cancer in ever smokers and never smokers, and to review the therapeutic implications of these molecular characteristics. To place the therapeutic implications in context, this section will first summarize recent studies of differential clinical outcomes in lung cancer patients by ever smoker versus never smoker status, irrespective of therapies targeting particular molecular determinants.
Four recent retrospective analyses have compared the characteristics and treatment outcomes of never smokers and smokers with lung cancer across stages of disease and regardless of modality of treatment (3–6). All of these series report on data obtained prior to widespread use of epidermal growth factor receptor (EGFR) inhibitors or other targeted therapies. Together these studies suggest that lung cancer in never smokers has peak incidence at a younger age than in smokers, is more likely to arise in women, and is more likely to be of adenocarcinoma histology. Furthermore, these studies show a survival advantage for never smokers compared with former and current smokers. These data are summarized in Table 1.
Age of Onset
Two studies from Singapore considering lung cancer across histologic types suggest that cancer in never smokers occurs at a younger age of onset (P < 0.001; refs. 3, 5). These data are further supported by an epidemiologic study in a Caucasian population (7). Etzel and colleagues report a higher proportion of never smokers (23.9%) among 230 cases of early onset lung cancer (≤50 years of age) than among 426 cases diagnosed at ≥70 years of age (17.6%; P < 0.001). However, a fourth study limited only to patients with adenocarcinoma (4) reports the opposite finding: median age of onset 63.5 years for never smokers versus 59.4 years for smokers (P = 0.0005).
Data from both Asia and the United States consistently report a higher proportion of women among never smokers with lung cancer relative to smokers with lung cancer. The study by Toh and colleagues found among a predominantly ethnic Chinese population in Singapore that more than 68% of the never smokers with lung cancer were women, compared with 12% of current, and 13% of former smokers (P < 0.001; ref. 5). Nordquist and colleagues reported that women comprised 78% of the never-smoker cohort, compared with 54% of the smokers in their series limited to patients with adenocarcinoma (P < 0.0001; ref. 4).
The Singapore series of Toh and colleagues was the only analysis that specifically focused on the distribution of histologic types among lung cancers arising in never, current, and former smokers (8). In this series adenocarcinomas comprised 69.9% of lung cancers in never smokers, versus 39.9% in current, and 47.3% in former smokers (P < 0.001). Conversely, squamous cell carcinoma comprised 5.9% of lung cancer in never smokers, versus 35.7% in current, and 28.0% in former smokers.
Four of the retrospective series include multivariate analysis evaluating outcome in never smokers versus ever smokers with lung cancer across stages. These studies consistently report a hazard ratio (HR) of approximately 1.3, favoring never smokers (Table 1).
Postsurgical outcome in never smokers
Two studies evaluated outcome among patients with surgically resected stage I non-small cell lung cancer (NSCLC; refs. 9, 10). Fujisawa reported 10-year overall and disease-specific survival of 84.9% and 88.2%, respectively, among individuals with 0 pack-years (n = 118), compared with 77.3% and 77.3% for smokers with 1 to 29 pack-years (n = 39), and 36.7% and 64.7% for smokers with ≥30 pack-years (n = 212). P-value for overall survival was <0.001 comparing 0 to ≥1 pack-years, and comparing <30 to ≥30 pack-years. Patients with a history of smoking had higher rates of death both from recurrent disease (P = 0.0004) and from nonmalignant causes (P = 0.026). However, survival was not significantly associated with smoking in a multivariate analysis including age, T stage, pleural invasion, and gender. The definition of the 0 pack-year cohort in this study is not provided; it is unclear to what extent this cohort represents lifelong never smokers. Similarly, Yoshino and colleagues evaluated outcome in 428 patients with resected stage I lung adenocarcinomas (10). Never smokers (n = 193) had improved postoperative survival over 5 years relative to ever smokers (P = 0.0001). In this study, ever-smoking status remained an independent prognostic factor in multivariate analysis including gender, histologic subtype, and pathologic substage.
Evaluating surgically resectable NSCLC more broadly, Nia and colleagues did a retrospective postsurgical analysis on outcome of 311 patients with resected stage I to IIIB disease operated on by a single surgeon (11). The study excluded patients with perioperative death (death within 30 days of surgery, or during the same hospitalization as surgery). Populations analyzed included active smokers (54.3%), recent quitters (11.3%), former smokers (11.3%), and never smokers (8%; n = 25). Current active smokers were considered as the reference group. Postsurgical survival was significantly better in never smokers than active smokers [relative risk = 0.45; 95% confidence interval (CI) 0.21-0.97; P = 0.042]. However, similarly improved outcome was seen among former smokers (relative risk 0.54, 95% CI 0.35-0.84; P = 0.006) and recent quitters (relative risk 0.34; 95% CI 0.16-0.71; P = 0.004), suggesting that a primary difference in survival was attributable to smoking status at the time of surgery.
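The relative risks, confidence intervals, and P values quoted above for the Nia series can be cross-checked against one another. The following is a minimal sketch, not the authors' method: it assumes the intervals are symmetric on the log scale and uses a normal (Wald) approximation, and the function name `p_from_ci` is introduced here purely for illustration.

```python
import math

def p_from_ci(estimate, lower, upper, z_crit=1.96):
    """Approximate two-sided P value for a ratio estimate from its 95% CI.

    Assumes the interval is symmetric on the log scale (Wald approximation).
    """
    # Standard error of log(estimate), recovered from the CI width.
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(estimate) / se
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))

# (relative risk, CI lower, CI upper, reported P), reference group: active smokers
reported = {
    "never smokers": (0.45, 0.21, 0.97, 0.042),
    "former smokers": (0.54, 0.35, 0.84, 0.006),
    "recent quitters": (0.34, 0.16, 0.71, 0.004),
}

recomputed = {
    group: p_from_ci(rr, lower, upper)
    for group, (rr, lower, upper, _) in reported.items()
}

for group, p in recomputed.items():
    print(f"{group}: recomputed P = {p:.3f}, reported P = {reported[group][3]}")
```

Under these assumptions the recomputed P values fall close to the reported ones for all three groups, which suggests the published estimates are internally consistent; small differences are expected from rounding of the published figures.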
Outcome in the absence of active treatment
No studies to date have been designed specifically to compare the outcome of never smokers versus ever smokers with advanced lung cancer in the absence of active therapy. However, two phase III randomized placebo-controlled trials of single-agent EGFR inhibitor therapy in advanced recurrent disease, the ISEL study with gefitinib (12) and the BR.21 study with erlotinib (13), collected data from the placebo control arms that are available for retrospective analysis. Never smokers in the ISEL study treated with placebo had a median survival of 6.1 months and a 1-year survival of 29% versus 4.9 months and an 18% 1-year survival for ever smokers. In the BR.21 study, the median survivals in the placebo group for never smokers and ever smokers were 5.6 months and 4.6 months, respectively. Although data from the placebo control arms in both studies suggest a trend toward improved outcome in never smokers, in neither case do these differences reach statistical significance, in contrast to the outcomes on the investigational arms of these studies (see below).
Chemotherapy treatment outcome in never smokers
Never smokers with lung cancer typically have not been considered as a separate clinical subgroup in trials of cytotoxic chemotherapies. Landmark trials used to set standard first-line (14) or second-line treatments (i.e., docetaxel and pemetrexed; refs. 15, 16) for patients with advanced and/or metastatic NSCLC did not report the smoking history of patients studied. Even the definitive study that showed that the angiogenesis inhibitor bevacizumab confers a survival advantage when added to standard doublet chemotherapy in the first-line setting over chemotherapy alone (17) did not report data on smoking histories. Thus, there are few prospective data about response or survival rates in never smokers treated with conventional systemic therapies.
There are two retrospective studies on the outcomes of never smokers who received standard cytotoxic chemotherapy, and they present conflicting results. In one study, Tsao and colleagues studied response, progression-free survival, and overall survival in 873 evaluable patients with stage III or IV NSCLC who received first-line chemotherapy at a single U.S. institution from 1993 to 2002 (18). Never smokers (n = 137) had higher response rates than former or current smokers (19% versus 8% versus 12%, respectively; P = 0.004), and improved overall survival (P < 0.0001). Never-smoking status remained an independent predictor in multivariate analysis including adjustment for age, gender, stage, and performance status, with an HR of 1.47 for former smokers (P = 0.003) and an HR of 1.55 for current smokers (P = 0.0004). These data contrast with those found by Toh and colleagues, who studied 143 patients with stage IIIB/IV NSCLC who received first-line chemotherapy at a single Singaporean institution from 1999 to 2002 (8). A total of 41% (59) of patients were “never” smokers, whereas 59% (84) were smokers. They found no significant differences between never smokers and ever smokers in response, duration of response, or survival, even after adjusting for known prognostic factors.
At least four factors could account for the observed differences between the two studies. First, the investigators studied different ethnic populations (United States versus Singapore). Second, there were fewer patients in the Singaporean trial, which may have been underpowered to detect differences in outcome between never and ever smokers. Third, different types of chemotherapy were given to the U.S. and Singaporean patients. In the U.S. study, 78% of patients received platinum-based regimens, whereas in the Singapore study, only 59% of patients received platinum-based agents. Lastly, definitions of the populations studied slightly differed in the two analyses. The Singapore study defined never smokers as individuals “who had never smoked or smoked too little in the past to be regarded as an ex-smoker,” whereas the U.S. study defined never smokers as individuals who smoked <100 total lifetime cigarettes.
EGFR, KRAS, and Treatment with EGFR Inhibitors
Interest in defining the characteristics of lung cancer in never smokers over the past several years has been driven in large part by a striking observation in clinical trials of the EGFR small molecule tyrosine kinase inhibitors gefitinib and erlotinib: lung cancer patients with minimal or no history of tobacco use showed markedly better clinical outcome when treated with these agents. Many retrospective studies of patients (from the United States, Europe, and East Asia) have shown that compared with ever smokers, never smokers show preferential clinical benefit from treatment with an EGFR inhibitor as upfront or salvage treatment, including statistically significantly higher response rates, longer times to progression, and/or longer median overall survivals (Table 2). Additional intriguing clinical observations included that responses were higher in patients of East Asian ancestry, in women, and in tumors with adenocarcinoma histology (19).
Intensive molecular research by multiple groups seeking to explain these evident population differences ultimately led to three key publications in 2004, reporting that mutations affecting the ATP binding site of the tyrosine kinase domain of EGFR are strongly associated with response to EGFR tyrosine kinase inhibitors (20–22). Mutations in EGFR have become a primary focus of research in lung cancer. A number of EGFR point mutations and deletions affect the ATP binding site of the receptor and may lead to a constitutively active and ligand-independent receptor state. The most common activating mutations in the kinase domain of EGFR are exon 19 deletions, which eliminate a leucine-arginine-glutamate-alanine (LREA) motif, and point mutations at position 858 in exon 21, resulting in substitution of arginine for leucine (L858R). Tumors with activating mutations in EGFR are highly dependent on continued EGFR signaling for proliferation and survival. Activating mutations in EGFR occur more commonly in never smokers, in women, in patients of East Asian ethnicity, and in adenocarcinomas.
These data have been the subject of multiple recent retrospective studies; in one notable example, Shigematsu and Gazdar reported on analysis of more than 2,000 NSCLCs (23). EGFR mutations were found to be more common in adenocarcinoma (30%; 413 of 1,380 cases) than in lung cancers of other histologies (2%; 16 of 993 cases), and more common in lung cancer from never smokers (45%) than from ever smokers (7%). There was a significant inverse correlation between smoking status and frequency of EGFR mutations (Fig. 1).
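The percentages quoted for histology follow directly from the case counts, and the same counts allow a rough measure of how strongly EGFR mutation tracks with adenocarcinoma histology. The sketch below is illustrative only: the odds ratio and its Woolf (log-scale) confidence interval are computed here from the quoted counts and are not figures reported by Shigematsu and Gazdar.

```python
import math

# Counts quoted in the text (EGFR-mutant cases / total cases by histology).
adeno_mut, adeno_total = 413, 1380
other_mut, other_total = 16, 993

p_adeno = adeno_mut / adeno_total   # proportion mutant, adenocarcinoma
p_other = other_mut / other_total   # proportion mutant, other histologies

# 2x2 table cells: mutant / wild-type within each histologic group.
a, b = adeno_mut, adeno_total - adeno_mut
c, d = other_mut, other_total - other_mut

# Illustrative odds ratio with a Woolf 95% CI (an assumption of this
# sketch, not a statistic given in the source).
odds_ratio = (a * d) / (b * c)
se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
ci_low = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
ci_high = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)

print(f"adenocarcinoma: {p_adeno:.1%}; other histologies: {p_other:.1%}")
print(f"odds ratio = {odds_ratio:.1f} (95% CI {ci_low:.1f}-{ci_high:.1f})")
```

The two proportions round to the 30% and 2% given in the text. The odds ratio is an unadjusted summary and does not account for smoking status, sex, or ethnicity, all of which are associated with EGFR mutation.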
EGFR targeted therapy
Several prospective studies have validated that never smokers preferentially benefit from EGFR inhibitor monotherapy (either gefitinib or erlotinib) as compared with ever smokers. Among these, two were prospective clinical trials but retrospectively analyzed (24) for smoking association [i.e., IDEAL-1 (gefitinib; ref. 25) and IDEAL-2 (gefitinib; ref. 26)], and two were prospective randomized placebo-controlled trials for second-line therapy with preplanned subgroup analyses [i.e., ISEL (gefitinib; ref. 12) and BR.21 (erlotinib; ref. 13); Table 2]. In the ISEL study with gefitinib, never smokers had a response rate of 18.1% versus 5.3% for ever smokers. The HR for survival for never smokers was 0.67 (P = 0.012). In the BR.21 study with erlotinib, the response rate for never smokers was 24.7% versus 3.9% in ever smokers, and the HR for survival was 0.4 (P = 0.02). These studies show that for patients with advanced NSCLC previously treated with chemotherapy, EGFR inhibitor monotherapy leads to greater benefit, in terms of improved response rate and prolonged survival, in never smokers compared with ever smokers. A treatment effect is further supported by the lack of a statistical difference in survival seen between never smokers and ever smokers in the placebo arms of the respective trials.
EGFR inhibitors may also be beneficial to never smokers in the first-line setting. In a prospective trial of gefitinib as first-line treatment in 36 never smokers, a 69% response rate was observed, with a median time-to-progression of 8 months (27). The median overall survival had not been reached at the time of publication. Recently, results were reported from a phase III randomized, open-label, first-line study of gefitinib versus carboplatin and paclitaxel in “never” or “light” East Asian smokers with advanced NSCLC (i.e., the IRESSA Pan Asia Study or “IPASS”). Patients with EGFR mutations experienced longer progression-free survival with gefitinib, and those without mutations had longer progression-free survival with chemotherapy [EGFR mutation positive, HR 0.48; 95% CI 0.36-0.64; P < 0.001 (favors gefitinib); EGFR mutation negative, HR 2.85; 95% CI 2.05-3.98; P < 0.001 (favors chemotherapy); ref. 28]. Final overall survival data have not yet been reported.
Taken together these several studies clearly show that never smokers benefit from treatment with EGFR inhibitors relative to ever smokers, and that this association is in large part attributable to the markedly higher rates of activating EGFR mutations in never smokers. Despite this important and consistent result, a standard treatment algorithm for never smokers with lung cancer remains to be established. Furthermore, although the addition of EGFR inhibitor to chemotherapy does not confer a survival advantage in unselected populations of NSCLC in four major studies (INTACT-2, ref. 29; INTACT-1, ref. 30; TRIBUTE, ref. 31; TALENT, ref. 32), a retrospective analysis of never smokers in one of the studies (TRIBUTE; ref. 31) showed that never smokers treated with the combination of erlotinib plus chemotherapy lived longer than those who received chemotherapy alone (22.5 versus 10.1 months; P = 0.01). To follow up on this observation, the Cancer and Leukemia Group B (CALGB) is conducting a trial for untreated never smokers, randomizing patients to erlotinib alone or erlotinib plus concurrent chemotherapy. Because of the limited numbers of never smokers, completion of this trial is not expected for several years.
The T790M mutation in EGFR, associated with acquired resistance to gefitinib and erlotinib, has been reported as a rare germline mutation associated with genetic susceptibility to lung cancer in a family with multiple lung adenocarcinomas and bronchoalveolar cancers (33). The smoking status of only one family member was reported; he had a history of smoking. In contrast to multiple prior reports, recent data suggest that this alteration in the catalytic domain of EGFR might result in increased kinase activity and confer a proliferative advantage to cells relative to wild-type EGFR (34).
Oncogenic mutations in KRAS are found in approximately 20% of lung adenocarcinomas, and are rare in other histologic subtypes. A meta-analysis evaluating the prognostic significance of KRAS mutations in lung adenocarcinoma concluded that KRAS mutation is a negative prognostic factor, with an overall HR for death of 1.35 (95% CI 1.16-1.56; ref. 35). KRAS and EGFR mutations, although both found primarily in adenocarcinomas, are essentially never found together in the same tumor, suggesting that these mutations may serve as alternative mechanisms for activating an overlapping set of oncogenic pathways (36–41). The EGFR cell surface receptor activates multiple downstream signaling pathways controlling cell proliferation and cell survival, including the Ras-Raf-ERK and PI3K-Akt pathways. A constitutively active mutant KRAS seems to obviate the need for upstream signaling from EGFR in activating the ERK pathway, and in many cells, PI3K-Akt as well.
Several investigators specifically evaluated whether KRAS mutations in lung adenocarcinomas correlate with smoking status. Of these studies, one (39) clearly showed an association with tobacco exposure. This study, focused on an Asian population, reported that KRAS mutations were associated with ever-smoking status (P = 0.003), male sex (P = 0.009), and poor histologic differentiation (P = 0.037). Two smaller studies of American and European populations failed to find a significant association between smoking status and KRAS mutations, although in both, a trend toward lower KRAS mutational frequency in never smokers was observed (42, 43). A recently published larger study evaluating KRAS mutational frequency in 482 lung adenocarcinomas in an American population also failed to show a statistically significant association between KRAS mutation and either smoking status or gender, although KRAS transversion mutations were strongly associated with smoking status (P < 0.0001; ref. 44). It is unclear whether the evident discrepancy between the Riely and Tam studies represents a true difference in lung adenocarcinoma biology between Asian and European and/or American populations (Fig. 1). An association between KRAS mutation and smoking status was confirmed in a recent large-scale gene-sequencing study of adenocarcinomas (discussed below; ref. 45).
Alternative kinase mutations: EGFR family members, STK11, and EML4-ALK
In addition to activating mutations in EGFR, similar activating mutations in related family members HER2 and HER4 have been found in a smaller fraction (approximately 2%) of lung adenocarcinomas. Gene amplification of HER2 and HER3 has been reported to confer increased sensitivity to EGFR tyrosine kinase inhibitors in lung cancer cell lines (46, 47). Interestingly, EGFR family member mutations, like EGFR and KRAS mutations, seem to be mutually exclusive, presumably representing mechanisms for activation of overlapping or redundant downstream oncogenic pathways (48). HER2 mutation is found at higher frequency among lung cancers of never smokers (P = 0.02; ref. 38). Associations of HER3 mutations with smoking status have not been definitively reported.
STK11 (LKB1) encodes a serine-threonine kinase implicated in cell proliferation and cell survival pathways, and has been found to be mutated in about 11% of NSCLCs (49). STK11 mutations are more frequent in lung cancers from ever smokers than never smokers (P = 0.007). STK11 mutations are more commonly found in KRAS mutant tumors (P = 0.042) and are very rare in EGFR mutant tumors (P = 0.002). The biological basis for these associations has not been defined.
A recently identified translocation results in an in-frame fusion between the EML4 and ALK genes (EML4-ALK) resulting in a fusion protein with preservation of the ALK kinase domain. Among 266 resected NSCLCs in an East Asian population, the EML4-ALK fusion gene was found in about 5% of cases (50). EML4-ALK was associated with younger age of cancer onset (P = 0.018) and with never-smoking status (P = 0.009). EML4-ALK, EGFR, and KRAS mutations were all mutually exclusive, suggesting that EML4-ALK may be an important oncogenic factor, and a potential therapeutic target in EGFR wild-type and KRAS wild-type lung cancer in never smokers.
Many lines of evidence suggest that individual susceptibility to lung carcinogens, including second-hand smoke (SHS), plays a role in the development of lung cancer. Increased susceptibility may have a genetic component, as suggested by consistent reports of familial aggregation of lung cancer (51). A number of studies of familial clustering of lung cancer have been undertaken, most of which report an approximately 1.5- to twofold increased risk associated with having a first-degree relative with lung cancer (51). Fewer studies have been conducted in never smokers. The first study of familial risk of lung cancer was conducted more than 40 years ago by Tokuhata and colleagues (52). This study found that nonsmokers with lung cancer were 40% more likely than nonsmoking controls to report a first-degree relative with lung cancer. Women were more likely than men to report such a family history. Some studies have shown no increased lung cancer risk in never smokers associated with family history of lung cancer in any first-degree relative (53–57), but results vary on the basis of type of relative affected, number of relatives affected, and age or sex of affected relative. Although Brownson and colleagues showed little increase in risk with one family member affected with lung cancer, a 2.6-fold increase (95% CI 1.1-6.2) in risk of lung cancer was seen for never smokers with a family history of five or more first-degree relatives affected (58). Wu and colleagues focused specifically on familial risk in never-smoking women and found that risk of adenocarcinoma of the lung was increased more than threefold with a family history of lung cancer in a mother [odds ratio (OR) 3.24; 95% CI 1.1-9.9] or sister (OR 3.59; 95% CI 1.3-9.8; ref. 54).
Young age at diagnosis often suggests an underlying genetic contribution to risk, and several studies have shown that family history of an early onset lung cancer is associated with increased risk of lung cancer among never-smoking relatives. Schwartz and colleagues (56) reported a sixfold increased risk of lung cancer among relatives of never smokers with lung cancer diagnosed between ages 40 and 59 years (95% CI 1.1-33.4), whereas Kreuzer and colleagues (53) found a nonsignificant threefold increase in risk of lung cancer in female never smokers under age 46 years with a family history (OR 3.28; 95% CI 0.71-15.1). A large cohort study in Japan reported that family history of lung cancer in either a parent or sibling was associated with a 2.5-fold increase in lung cancer risk among never smokers (95% CI 1.27-4.84; ref. 59). A recent meta-analysis, including 11 studies, evaluated risk associated with family history of lung cancer among never smokers and reported that family history contributed to risk (RR = 1.51; 95% CI 1.11-2.06; ref. 51). In six studies with information on number of relatives affected, lung cancer risk in never smokers was increased 57% (95% CI 1.34-1.84) when one relative was affected, and 2.5-fold when two or more relatives were affected (95% CI 1.72-3.70).
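Because confidence intervals for relative risks are usually constructed on the log scale, the point estimate should sit near the geometric mean of the two bounds. The following minimal sketch applies that check to the familial risk estimates quoted above. It treats the range 1.27-4.84 given for the Japanese cohort as an interval around the stated 2.5-fold increase, which is an assumption of this sketch, and the helper name `implied_estimate` is hypothetical.

```python
import math

def implied_estimate(lower, upper):
    """Point estimate implied by a log-symmetric interval (geometric mean)."""
    return math.sqrt(lower * upper)

# label: (reported relative risk, interval lower bound, interval upper bound)
quoted = {
    "Japanese cohort, parent or sibling affected": (2.5, 1.27, 4.84),
    "meta-analysis, any family history": (1.51, 1.11, 2.06),
    "meta-analysis, one relative affected": (1.57, 1.34, 1.84),
    "meta-analysis, two or more relatives affected": (2.5, 1.72, 3.70),
}

implied = {
    label: implied_estimate(lower, upper)
    for label, (_, lower, upper) in quoted.items()
}

for label, (reported_rr, _, _) in quoted.items():
    print(f"{label}: reported {reported_rr}, implied {implied[label]:.2f}")
```

Each implied value agrees with the reported estimate to within rounding, so the intervals and point estimates in this paragraph appear mutually consistent under the log-symmetry assumption.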
Several limitations are relevant to the interpretation of these studies, the most notable of which are incomplete or often no adjustment for family structure and smoking among relatives, and lack of validation of family histories. Also, when risk among never smokers is of particular interest, sample sizes tend to be small and analyses that detail smoking status of all affected relatives are lacking. Despite these caveats, there is strong evidence of a familial contribution to risk of developing lung cancer among never smokers.
The findings of family aggregation suggest the possibility of one or more susceptibility genes for lung cancer. Family linkage studies have been used successfully to identify highly penetrant, low frequency susceptibility genes for diseases. To date, there is only one ongoing lung cancer family linkage study. This national study, being conducted by the Genetic Epidemiology of Lung Cancer Consortium, is not limited to lung cancer among never smokers, and high risk families often include smoking and never-smoking affected members. The first findings from this study were reported by Bailey-Wilson and colleagues and include linkage of lung cancer to a region on chromosome 6q23-25 (146cM to 164cM; ref. 60). It was also shown that lung cancer risk among putative carriers of the linkage region was increased even in never smokers. The search for a lung cancer gene within this region is ongoing. Only about 1% of lung cancer patients have such extensive family histories (with three or more affected relatives), but 10 to 15% have at least one first-degree relative with the disease.
Candidate gene association studies
Although linkage studies are typically used to detect highly penetrant genes that are rare, alternative approaches such as association studies are used to detect susceptibility genes that are more common, i.e., with minor allele frequencies of 5% or more. Such studies in lung cancer have primarily focused on polymorphisms in genes coding for enzymes involved in phase I and phase II metabolism of carcinogens in tobacco smoke and DNA damage repair. This body of work was recently reviewed, and few consistent results have been identified using this approach, even in large populations primarily comprised of smokers (61). The work in never smokers is limited to a few studies of modest sample size. These studies focus on the same set of candidate genes evaluated in smokers. The rationale for studying polymorphisms in the tobacco metabolism pathway in nonsmokers is that a significant subset of nonsmokers have exposure to SHS, suggesting that some of the underlying mechanisms of lung carcinogenesis may be the same as seen in smokers (62). Conversely, a larger fraction of cancers in never smokers may have no clear relationship to tobacco carcinogens, and inclusion of such cases in analysis makes negative findings somewhat difficult to interpret. It has been suggested that a genetic contribution to risk might be more evident when environmental exposures are low, as in the case of SHS exposure and in never smokers. These populations have generally not been a primary focus of candidate gene analyses: data summarized below are primarily derived from subset analysis of larger studies.
Metabolism of polycyclic aromatic hydrocarbons (PAHs), tobacco-specific nitrosamines, and aromatic amines in cigarette smoke occurs via two classes of enzymes: phase I (oxidation-reduction-hydrolysis) and phase II (conjugation) enzymes. The following phase I and II enzymes have been studied in at least 100 never smokers with lung cancer: CYP1A1, CYP1B1, NAD(P)H quinone oxidoreductase 1 (NQO1), N-acetyltransferase 2 (NAT2), and the glutathione S-transferases (GSTM1, P1, and T1).
CYP1A1 and CYP1B1 are active in the metabolism of PAHs found in tobacco smoke. Two CYP1A1 polymorphisms are the most frequently studied: a T3801C substitution resulting in an MspI restriction site, and an A2455G substitution resulting in an Ile462Val change in exon 7. A CYP1B1 Leu432Val polymorphism has also been studied. Glutathione S-transferases occur in a number of classes and act to conjugate electrophilic compounds with glutathione. GSTM1 and GSTT1 occur in null forms resulting in enzyme deficiencies, whereas an Ile105Val polymorphism in GSTP1 is the most frequently studied. NQO1 can act in both carcinogen activation and detoxification, and the Pro187Ser polymorphism has been evaluated. Individuals can carry a rapid or slow acetylator phenotype that can be defined by a series of single nucleotide polymorphisms (SNPs) in NAT2.
Because never smokers make up a small fraction of all lung cancer cases, some of the larger studies of never smokers have come from consortia. A pooled analysis of 11 primarily European studies by the Genetic Susceptibility to Environmental Carcinogenesis (GSEC) consortium included 130 never-smoking cases and 925 never-smoking controls (63). Never smokers carrying at least one CYP1A1 Val allele at Ile462Val were at twofold (95% CI 1.36-3.13) increased risk of developing lung cancer. The association was stronger in women than in men. A follow-up pooled analysis by GSEC, now including 14 studies and 302 never-smoking cases and 1,631 never-smoking controls, showed a threefold increased risk associated with the CYP1A1 Val allele of Ile462Val among never smokers (95% CI 1.51-5.91), but no effect of the MspI polymorphism or the GSTM1 null genotype (64, 65). In evaluating the results of these 14 studies independently, few of the studies showed positive findings, and only one of the studies included more than 100 never-smoking cases. Only three studies provided data on SHS exposure.
Raimondi and colleagues (66) evaluated the role of GSTT1 in lung cancer in a large pooled and meta-analysis based on 34 studies, which included data on more than 4,000 never smokers. No association was found between the GSTT1 null genotype and lung cancer risk among never smokers in these analyses.
In addition to the pooled analyses, there are few studies of never smokers that include at least 100 cases. Only these larger studies are discussed here. None of the polymorphisms in GSTM1, GSTT1, and GSTP1 studied by Wenzlaff and colleagues (67) were associated with risk of lung cancer among never smokers. However, in individuals with 20 or more years of household SHS exposure, carrying the GSTM1 null genotype was associated with a 2.3-fold increased risk (95% CI 1.05-5.13). Risk was further increased in these individuals if they also carried the GSTP1 Val allele at Ile105Val (OR 4.56; 95% CI 1.21-17.21). Similar findings of increased risk among GSTM1 null never smokers exposed to SHS were noted in two other studies. In one, GSTM1 null women with 40 or more pack-years of SHS exposure through their husbands were at 2.3-fold (95% CI 1.13-4.57) increased risk of lung cancer (68). In a case-only study, an OR of 2.3 (95% CI 1.15-4.51) was noted in the subset of never-smoking women with lung cancer who were both GSTM1 null and exposed to SHS (69). No significant associations for GSTM1 or GSTT1 and lung cancer risk were reported by Malatas and colleagues (70).
Neither Kiyohara and colleagues in a study of Japanese women, Wenzlaff and colleagues (71) in a population-based study in the United States, nor Bennett and colleagues (72) in a case-only analysis found lung cancer risk associated with the CYP1A1 polymorphisms, even when taking into account SHS exposure. In one of these studies, polymorphisms in the gene coding for CYP1B1 were also evaluated. Carrying at least one Val allele in CYP1B1 Leu432Val was associated with an almost threefold (95% CI 1.63-5.07) increase in risk of lung cancer in Caucasian never smokers. This association was most significant in never smokers with SHS exposure.
Neither Chao and colleagues (73) nor Bock and colleagues (74) report significant associations of the NQO1 Pro187Ser polymorphism with risk of lung cancer in never smokers. In one study, however, the subset of individuals age 50 years or greater carrying the T allele was at approximately 50% decreased risk of lung cancer (95% CI 0.27-0.87; ref. 74). The acetylation phenotype was evaluated for its contribution to lung cancer risk in a Taiwanese population of never smokers (75). Several SNPs in NAT2 were genotyped, allowing classification of never smokers as either rapid acetylators or slow acetylators. Rapid acetylators were at 2.5-fold increased risk of developing lung cancer (95% CI 1.40-4.23).
In addition to work focused on phase I and II enzyme polymorphisms, studies have looked at the role of SNPs in genes coding for DNA repair enzymes. These studies also suffer from limited sample sizes, particularly of never smokers. In a case-control study nested in a large cohort study, Matullo and colleagues evaluated 16 DNA repair polymorphisms and found no association with risk of lung cancer in never and former smokers (76). Likewise, no significant lung cancer risk has been found to be associated with genotype at polymorphisms in OGG1 (77, 78) among never smokers. Although prior studies have failed to find evident associations between polymorphisms in XRCC1 and lung cancer risk in never smokers (77, 79), one recent case-control study of Chinese never-smoking women reported significant associations with two of five polymorphisms examined (80). One of these two is a silent base change, of questionable biological significance, and these associations have not been confirmed in other similar (and larger) data sets. O6-alkylguanine DNA alkyltransferase (AGT) repairs DNA adducts caused by the metabolism of N-nitroso compounds in cigarette smoke. Three SNPs in AGT have been studied in never smokers (81). The 143Val/178Arg variant alleles were associated with a twofold (95% CI 1.03-4.07) increased risk of lung cancer in this study. A recent pooled analysis of 14 studies of sequence variants in 12 DNA repair enzymes, including data from 8,454 cases and 9,344 controls (878 never-smoking cases and 3,326 never-smoking controls), revealed only four sequence variants weakly associated with lung cancer risk, none with evident specificity for never-smoking status (82).
Genome-wide association studies
A number of large-scale genomic analyses of lung cancers, including genome-wide association studies characterizing genetic contributors to lung cancer risk in case-control series, and genomic and epigenetic studies characterizing tumor-specific somatic genomic alterations in histologic subtypes of lung cancer, have been recently presented.
High density arrays of SNP loci can be screened to produce detailed profiles of germline alterations associated with increased cancer risk. Remarkably, three recent genome-wide association studies, each involving analysis of more than 300,000 SNPs in several thousand cases and controls, independently converged on the 15q24-25.1 chromosomal locus as a polymorphic site, genomic amplification of which was associated with lung cancer risk (83–85). This region contains genes for nicotinic acetylcholine receptor subunits, suggesting a possible association with nicotine dependence. A subsequent analysis taking into account data from all three studies and additional never-smoking cases suggests that the 15q24-25.1 locus polymorphism is not associated with lung cancer in never smokers, supporting the view that its primary influence on lung cancer risk may be through an influence on tobacco addiction (86). To date, genome-wide association studies of lung cancer have not defined genetic factors contributing to lung cancer risk in never smokers.
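To give a sense of the multiple-testing burden in scans of this size, the sketch below computes a Bonferroni-adjusted per-SNP significance threshold. The family-wise error rate of 0.05 and the choice of Bonferroni correction are assumptions made for illustration; the cited studies may have used other thresholds or procedures.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def expected_false_positives(alpha: float, n_tests: int) -> float:
    """Expected number of tests passing an uncorrected threshold if every null is true."""
    return alpha * n_tests


n_snps = 300_000  # order of magnitude quoted in the text
print(f"per-SNP threshold: {bonferroni_threshold(0.05, n_snps):.2e}")
print(f"expected chance hits at P < 0.05: {expected_false_positives(0.05, n_snps):.0f}")
```

With 300,000 tests, an uncorrected threshold of 0.05 would be expected to flag thousands of SNPs by chance alone, which is why genome-wide scans require much more stringent per-SNP thresholds and correspondingly large sample sizes.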
Broad-based screens for both tumor-specific (somatic) epigenetic and genetic alterations have recently been done. Weir and colleagues conducted a large-scale SNP analysis of adenocarcinomas, the most common type of lung cancer in never smokers (87). Although this data set of 371 patients included 82 never smokers, no correlations between amplifications or deletions and smoking status were significant after correction for multiple hypothesis testing. Shames and colleagues published an epigenetic genome-wide screen, evaluating promoter methylation profiles in 107 primary lung cancers (88). Unfortunately this data set included only nine “never smokers or non-smokers,” precluding any assessment of epigenetic alterations specific to lung cancer in the never smoker. Ding and colleagues recently applied high throughput gene sequencing to 188 primary lung adenocarcinomas including 20 never smokers, sequencing 623 genes suggested to play a role in cancer biology (45). The investigators identified a total of 26 genes with sufficiently high frequencies of mutation to implicate them in lung carcinogenesis. The average number of mutations in smokers was significantly higher than in never smokers (P = 0.021), and the authors were able to confirm the previously noted associations between never-smoking status and EGFR mutation (P = 0.0046), and smoking status and both KRAS mutation (P = 0.021) and STK11 mutation (P = 0.044).
Summary: genetic epidemiology
The goal of profiling the genetic characteristics of an individual at high risk for lung cancer is to better understand the underlying biology of lung carcinogenesis in an effort to direct treatment and prevention strategies. This is particularly important to advance the state of knowledge of lung carcinogenesis among never smokers. Familial aggregation of lung cancer has been fairly well characterized and accounts for an approximate 50% increase in risk of lung cancer among never smokers. Family linkage studies, with the goal of identifying a lung cancer susceptibility gene, have included families with both smoking and never-smoking members and have yet to identify such a gene. Although hundreds of candidate gene studies have been conducted in smokers, the specific genes that are associated with alterations in risk remain poorly understood. Fewer studies have focused on never smokers, and these studies have substantial limitations. The studies discussed represent the largest conducted; however, many are underpowered to detect moderate risks, especially when allele frequencies are low, and many lack information on SHS exposure. Gene-gene or gene-environment interactions have not been studied and require extremely large sample sizes. False positive results are a potential problem and are more likely when initially small data sets are stratified by age, race, gender, and histologic type. Larger studies from consortia have the capacity to pool findings across studies to increase sample size and power. These consortia should focus on genetic susceptibility in never smokers, but have their own limitations including population heterogeneity due to significant differences in allelic frequencies between races and ethnicities, differing case and exposure definitions, and differing genotyping methods.
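The point about statistical power can be made concrete with a rough calculation. The Python sketch below uses a normal approximation for a two-sided comparison of carrier frequencies between cases and controls; the sample sizes, the 10% carrier frequency, and the odds ratio of 1.5 are hypothetical values chosen for illustration and are not taken from any of the cited studies.

```python
from statistics import NormalDist


def carrier_freq_in_cases(p_controls: float, odds_ratio: float) -> float:
    """Carrier frequency in cases implied by the control frequency and an odds ratio."""
    odds = odds_ratio * p_controls / (1 - p_controls)
    return odds / (1 + odds)


def approx_power(n_cases: int, n_controls: int, p_controls: float,
                 odds_ratio: float, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided two-proportion z-test (unpooled variance)."""
    p_cases = carrier_freq_in_cases(p_controls, odds_ratio)
    se = (p_cases * (1 - p_cases) / n_cases
          + p_controls * (1 - p_controls) / n_controls) ** 0.5
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(abs(p_cases - p_controls) / se - z_crit)


# Hypothetical scenarios: a single modest study versus a pooled consortium.
for n in (100, 2000):
    print(n, round(approx_power(n, n, p_controls=0.10, odds_ratio=1.5), 2))
```

Under these assumed inputs, a study with 100 never-smoking cases and 100 controls has low power to detect a moderate effect of an uncommon allele, whereas a pooled data set in the thousands approaches adequate power. The approximation ignores covariate adjustment and stratification, both of which reduce power further.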
A limited number of candidate SNPs has been studied that represent only some of the variation within a gene, may not be functional, and are unlikely to be acting alone. These analyses have primarily focused on genes encoding carcinogen metabolic enzymes relevant to tobacco carcinogens. Newer approaches that select candidate genes within pathways and genotype at multiple markers within a gene have begun to be applied to lung cancer generally, and to lung cancer in never smokers. Given the complexity of the genome and pathways involved in carcinogenesis, it is likely that there are relevant pathways yet to be fully characterized. The use of new technology to genotype thousands of candidate SNPs per sample is allowing for more complete coverage of the variation within candidate genes in multiple pathways. In addition to candidate gene studies, whole genome association studies offer a powerful approach when relative risks are modest and in which environmental factors play a role. Genome-wide studies in lung cancer have included limited numbers of never smokers. Genomic analyses of tumor-specific alterations, including copy number and mutational studies, have confirmed previously identified mutations associated with never-smoking status, but to date have not defined novel mutations in the etiology of lung cancer of never smokers.
Biomarkers of Exposure
Within the population of never smokers with lung cancer, a critical distinction may exist between cancers related to SHS and cancers unrelated to tobacco exposure. Limited methods exist for quantitation of SHS exposure by both intensity of exposure and duration. Previous studies have focused on current smokers compared with either ex-smokers or never smokers. Studies of never smokers stratified by exposure to passive smoke, indoor pollution, and other risk factors are limited. A comprehensive review of exposure and cancer-related biomarkers of tobacco smoke has been recently published (Supplementary Table S1; ref. 89).
TP53 mutations as markers of exposure
There are several molecular differences in lung cancers from smokers as compared with never smokers (Table 3). One of the most intensively characterized biomarkers of exposure is the TP53 tumor suppressor gene, which is located on chromosome 17p13.1. It encodes a multifunctional p53 phosphoprotein, important in apoptosis, cell cycle regulation, senescence, and DNA repair (for reviews see, e.g., refs. 90–92).
Most (if not all) human tumors lack function of p53 either through a gene deletion or mutation or through an indirect or posttranslational mechanism, such as increased expression of negative regulators of p53, like the overexpression of Jab1 (93). In some cases mutations of the TP53 gene, of which 73.7% are missense mutations, show a mutation spectrum implicating carcinogen-specificity (Supplementary Table S2) and may serve as environmental and clinical biomarkers (94–96). There are more than 2,900 TP53 mutations from lung cancer (in 38.6% of the tumors analyzed) listed in the latest release of the IARC-based TP53 mutation database (see also ref. 97). Information about smoking and/or nonsmoking can be found in more than 1,000 cases, but about passive smoke exposure in very few cases. Never smoker is not a designation in this database.
Although both TP53 mutation frequency and the number of hotspot mutations are higher in smokers than in never smokers (98), major differences in the mutation spectra are less clear when all smokers are compared with all nonsmokers in the IARC database (Fig. 2). Deletions and insertions (10% in smokers, 5% in nonsmokers) and G:C to T:A mutations (31% in smokers, 26% in nonsmokers) show some difference. The extent to which TP53 mutations observed in nonsmokers reflect exposure to SHS cannot be defined currently because of sparse data related to quantitation of exposure. G:C to T:A transversions, typically induced by the cigarette smoke carcinogen benzo(a)pyrene (BP) in experimental systems (99), are very common at CpG sites in smoking-related lung cancer, and their frequency increases with increased smoking (100–102). One report (103) suggests a higher TP53 mutation frequency in smokers with asbestos exposure compared with smokers without asbestos. Experimental studies by Pfeifer and coworkers (104–109) support a PAH-specific fingerprint in TP53 mutation spectrum in smokers (102).
A definitive demonstration of a specific TP53 mutation spectrum related to smoking requires the comparison of the mutation spectra from smokers and never smokers. Both possible gender (98, 110) and geographical differences (96, 101) have to be taken into account in such studies. The few existing studies of never smokers support the hypothesis of smoking-related TP53 mutations, with a higher frequency in smokers than in never smokers (43, 111). Toyooka and coworkers (98) analyzed in detail the R6 version of the IARC database and found that the G:C to T:A difference between smokers and nonsmokers can be entirely accounted for by the difference in lung cancers in women, with a 13% frequency in never smokers compared with 36% in smokers; lung cancers in men showed no difference. These results were interpreted as suggestive of higher sensitivity of women to tobacco carcinogens.
Even fewer articles have evaluated the effect of former smoking or exposure to SHS on TP53 mutations. Lung cancers from passive (112) and former smokers (43) have a higher prevalence of TP53 mutations than cases with no history of tobacco exposure, and lower prevalence than that of current smokers (111). That a different TP53 mutation frequency and spectrum between smokers and nonsmokers has also been described in colorectal (113) and in bladder cancer (114) gives support to the idea that TP53 mutations may be useful as a biomarker of smoking etiology of cancer. An ongoing challenge is to design studies incorporating reliable data on exposure to SHS.
An interesting correlation has recently emerged between analysis of p53 pathway and EGFR pathway mutations in lung cancer in never smokers. The factor p14ARF complexes with and inhibits Mdm2, the primary regulator of p53 stability. p14ARF deficiency leads to secondary p53 instability and decreased p53 function. Mounawar and colleagues analyzed a series of cases in the IARC database for mutations in factors including EGFR, p53, and p14ARF (115). Downregulation of p14ARF expression was more frequently observed in tumors of never smokers (62.5%) than ever smokers (35%; P = 0.0008). Among never smokers, 11 of 16 EGFR mutant tumors were also TP53 mutant, whereas only 2 of 17 EGFR wild-type tumors had TP53 mutations (P = 0.0008). All EGFR mutant tumors with wild-type TP53 showed suppression of p14ARF expression. Taken together these data suggest a functional interaction between loss of p53 activity and dysregulation of EGFR in lung cancer in never smokers.
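The association between EGFR and TP53 mutation status in never smokers can be re-derived from the counts given above (11 of 16 versus 2 of 17). The Python sketch below computes both an uncorrected Pearson chi-square P value and a two-sided Fisher exact P value for that 2×2 table. Which test the original authors used is not stated here, so both are shown as assumptions; the function names are illustrative.

```python
from math import comb, erfc, sqrt


def chi_square_p(a: int, b: int, c: int, d: int) -> float:
    """Uncorrected Pearson chi-square P value (1 df) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of chi-square with 1 df, written via erfc.
    return erfc(sqrt(chi2 / 2))


def fisher_exact_p(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact P value: sum of tables no more likely than the observed."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


# Rows: EGFR mutant, EGFR wild-type; columns: TP53 mutant, TP53 wild-type.
table = (11, 5, 2, 15)
print(f"chi-square P = {chi_square_p(*table):.4f}")
print(f"Fisher exact P = {fisher_exact_p(*table):.4f}")
```

On these counts the uncorrected chi-square P value is close to the reported 0.0008, whereas the Fisher exact test, often preferred for cell counts this small, gives a somewhat larger value that is still well below 0.01.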
As discussed in the accompanying CCR Focus article, the high incidence of lung cancer among nonsmoking Chinese women has been closely associated with the use of smoky coal (116–119) with a dose-dependent increase in the risk (120, 121). Smoky coal emissions are very rich in PAH-compounds (43% of organic emissions; ref. 122). Consequently, there is a higher number of BP-DNA adducts in the bronchoalveolar-lavage cells, peripheral blood mononuclear cells, and placentas of smoky coal exposed women than in nonexposed controls (123). Supporting the BP-specific mutation spectrum in smokers, DeMarini and coworkers (124) found the TP53 mutations in the lung cancer tissue of these women clustering in codons 153 to 158, with most of the mutations being G to T, and with all G to T transversions found on the nontranscribed strand. A clear difference between lung cancers associated with coal smoke versus tobacco smoke was the TP53 codon 154, a hotspot for PAH adducts (104, 106), but not associated with lung cancers from smokers (102). In general, both the frequency and type of mutations (primarily GC to TA transversions) induced by smoky coal, cigarette smoke condensate, and BP in bacterial strains are similar to those found in lung cancers from nonsmoking women exposed to smoky coal (122).
Taylor and coworkers (125) originally reported the intriguing finding of a TP53 hotspot in codon 249 in radon-exposed uranium miners from Colorado (Table 4). Among the 52 studied lung cancers, 16 contained an AGG-ATG transversion leading to an amino acid change (arg to met), distinct from the aflatoxin-related mutation in liver cancers at the same locus (AGG-AGT, arg to ser; refs. 126, 127). However, other studies of uranium miners with radon exposure (128–132), or of other environmentally exposed populations (43, 133, 134) have found very few or no codon 249 AGG-ATG mutations in lung cancers. The predominant type of mutations induced by ionizing radiation in mammalian cells are large deletions or translocations (127). An alternative hypothesis about mycotoxins contributing to the peculiar codon 249 mutation found by Taylor and coworkers (125) has been proposed (135). Although the radon-induced mutation spectrum in lung cancer may be different from the mutation spectrum in cigarette smoke-induced lung cancer from people not exposed to radon, no definitive mutation or set of mutations typical of radon has been defined.
Other biomarkers of exposure
Highly specific and sensitive analytical assays of metabolites of nicotine- and tobacco-specific N-nitrosamines have been developed that can measure these metabolites in biofluids from infants and adults exposed to SHS. Of these, the most commonly used as a biomarker of exposure is cotinine, which has been extensively applied to quantitate exposure in nonsmokers (136, 137). This analytical approach is limited to recent exposure and does not measure lifetime exposure to tobacco smoke.
Carcinogen-DNA or -protein adducts are also measures of tobacco smoke exposure. Blood hemoglobin adducts with 4-aminobiphenyl, acrylamide, ethylene oxide and acrylonitrile are higher in current smokers than in never smokers (138–141), but are not specific for tobacco smokers. DNA adducts have also been measured by the sensitive and nonspecific 32P-postlabeling assay, but this method also has problems with quantitation (89).
Detection of mutagens in urine using Salmonella typhimurium strains has been widely used and has shown the mutagenicity to be generally dependent on tobacco smoking (142). However, confounding factors such as diet affect urinary mutagenicity. The specific chemicals in tobacco smoke that cause the DNA damage and subsequent mutations are unknown.
Chromosomal markers, e.g., frequencies of sister chromatid exchanges in blood lymphocytes, are generally increased in smokers (142), but are not specific for chemicals in tobacco smoke.
We agree with the conclusion of Hatsukami and colleagues (ref. 89; p. 604), that there exists “…no comprehensive set of biomarkers of carcinogen exposure or biological effects as a predictive measure of the total carcinogenicity related to exposure to tobacco or tobacco smoke.” Definition and application of highly sensitive and specific methods for quantitating both acute and chronic SHS exposure are clearly primary research needs in relationship to lung cancer in never smokers.
Summary and Conclusions
The emergence of targeted therapies with a differential benefit in clinically and molecularly defined categories of lung cancer emphasizes the need for close interaction between epidemiologists, laboratory scientists, and clinicians to promote better treatment for distinct categories of patients with this disease. Lung cancers can be divided into subgroups based on both differential natural history and differential response to therapy among different patient groups. A critical dichotomy in lung cancer seems to be that between ever smokers and never smokers. These categories have both prognostic and therapeutic implications. The etiologic differences between lung cancer in ever and never smokers, in terms of underlying genetic risk factors and differential pathways of molecular carcinogenesis, are beginning to be defined, but have only recently become a sharp focus of investigation.
Although lung cancer in never smokers would rank independently among the 10 most common causes of cancer mortality, there has been a relative paucity of attention to this important patient population. Despite a few published reports in the field, familial aggregation and genetic linkage studies in lung cancer have generally not focused on never-smoking cohorts as a separate entity. Similarly, candidate gene association studies in lung cancer have primarily focused on genes relevant to pathways involved in tobacco carcinogen metabolism, which may be of less immediate relevance to genetic contributors to lung cancer risk in the absence of tobacco smoke. Genome-wide analyses of lung cancer in never smokers specifically may be of great interest in defining genetic factors augmenting lung cancer risk. Risk factors identified in never smokers, in which the overwhelming effects of tobacco-induced carcinogenesis are minimized, are likely to be distinct from those currently identified in relation to tobacco metabolic pathways, and may help define lung cancer risk both in the presence and absence of tobacco exposure.
Key tumor suppressor genes and proto-oncogenes with mutational profiles that differ between lung cancers in never smokers and ever smokers include TP53, KRAS, EGFR, STK11, and EML4-ALK. TP53 mutations that show a dose-dependent increase in frequency with tobacco smoke exposure include G to T transversions at hotspots directly influenced by known tobacco carcinogens. However, there is sufficient overlap between the TP53 mutational spectra of lung cancers in smokers and never smokers to preclude differentiating tumors solely on the basis of TP53 genotype. Mutations in KRAS and in EGFR are mutually exclusive, and seem to represent alternative oncogenic pathways to transformation, both resulting in activation of the downstream MAPK pathway as well as other pathways regulating proliferation and survival. EGFR mutations are relatively common in lung cancer in never smokers, but rare in lung cancers in smokers. Small molecule inhibitors of the EGFR tyrosine kinase, such as erlotinib and gefitinib, show very high response rates and offer significant clinical benefit against tumors with activating mutations of EGFR. In contrast, tumors with constitutively activating mutations of KRAS (predominantly active or former smokers) show essentially no objective responses to these drugs. Similar promising data are now emerging for treatment of EML4-ALK mutant lung cancer with targeted inhibitors of ALK.
The biased distribution of KRAS and EGFR mutations in ever and never smokers and the differential clinical benefit from EGFR-directed therapy in these patient subsets have spurred significant interest in the issue of lung cancer in never smokers. However, KRAS and EGFR mutant tumors together represent a minority of NSCLCs in both ever- and never-smoking patients.
Molecular pathways of lung carcinogenesis are emerging and need additional definition, particularly those pathways relevant to the never smoker. Evidence in hand suggests that many of the molecular pathways in lung cancer in never smokers are different from those typically implicated in smokers. Supportive evidence includes the observed differences in histology of lung cancers (higher frequency of adenocarcinomas in never smokers than in smokers), differences in mutational spectra found in the tumors, and differences in response to therapy. Recent estimates based on pooled data from 17 published reports, totaling more than 26,000 cases, suggest that the ratio of adenocarcinoma to squamous cell carcinoma is approximately 0.4 in lung cancers in smokers, compared with 3.4 in never smokers (143).
Mutational spectra in several genes differ in lung cancer of never smokers compared with smokers. Most of the current literature has focused on TP53, EGFR, and KRAS mutations. Other critical genes and/or pathways will need to be defined. Defining the many other genetic and molecular differences that distinguish tobacco-driven tumorigenesis from the development of lung cancers in never smokers is of critical importance in identifying alternative risk factors for lung cancer, of particular but not exclusive relevance to never smokers. In addition, the lessons of EGFR-targeted therapies teach us that defining these differences at a molecular level may identify important therapeutic strategies for targeting key oncogenic events in the thousands of patients with this disease.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
- Received February 12, 2009.
- Revision received June 17, 2009.
- Accepted June 24, 2009.