
Imaging, Diagnosis, Prognosis
Authors' Affiliations: Departments of ¹Surgery, ²Biostatistics, Bioinformatics, and Epidemiology, and ³Pathology and Laboratory Medicine, ⁴College of Medicine, Medical University of South Carolina, Charleston, South Carolina
Requests for reprints: Kaidi Mikhitarian, Department of Surgery, Medical University of South Carolina, 96 Jonathan Lucas Street, Suite 420, P.O. Box 250613, Charleston, SC 29425. Phone: 843-792-7789; Fax: 843-792-4813; E-mail: mikhitar{at}musc.edu.
| Abstract |
|---|
Key Words: real-time PCR; artificial neural network (ANN); mammaglobin (mam); trefoil factor 1 (TFF1)
We recently reported interim results of the Minimally Invasive Molecular Staging of Breast Cancer Trial (MIMS), a prospective cohort study designed to define the clinical significance of molecular detection of micrometastatic breast cancer in axillary lymph nodes (ALN) (8). ALN from 489 patients with T1 to T3 primary breast cancers were analyzed by standard histopathology (H&E staining) and by multimarker, real-time reverse transcription-PCR (RT-PCR) for the following genes: mam, mamB, muc1, CEA, PDEF, CK19, and PIP. The interim results indicate that overexpression of breast cancer-associated genes in breast cancer subjects with pathology-negative ALN correlates with traditional predictors of disease progression, providing strong evidence that molecular markers serve as valid surrogates for the detection of micrometastatic breast cancer (8). Despite these positive results, one surprising finding from the interim analysis is that, whereas the majority of molecular markers were informative for the detection of metastatic breast cancer, only a few were informative for the detection of micrometastatic breast cancer. Of particular interest, the molecular marker that was most highly expressed in pathology-positive ALN, mammaglobin (mam), also had the highest apparent sensitivity for the detection of micrometastatic disease. Recent immunohistochemistry (IHC) studies have shown that mam is a valid surrogate of micrometastatic disease (9).
In this study, using quantitative real-time RT-PCR data from the MIMS Trial, we did a rigorous statistical analysis of the relationship between relative levels of gene expression and the ability of individual molecular markers to detect micrometastatic breast cancer. The result of this analysis is a statistical validation of the concept that the most informative markers for detection of micrometastatic disease are those that are most highly expressed in metastatic disease. To further test this hypothesis, we developed an innovative microarray strategy to identify genes that are most likely to be informative for the detection of micrometastatic disease.
| Materials and Methods |
|---|
Artificial neural network development and generation of frequency distribution analyses. Frequency distribution analyses were done using an artificial neural network (ANN) developed by one of the authors (J.S.A.) following published guidelines (10). The guidelines consist of using bootstrapped cross-validation for automatic identification of feedforward ANN models, including both regression early stopping and optimal topology selection, while avoiding model overfitting. The ANN model is detailed in Eq. A, where w1, w2, b1, and b2 are variable vectors with length equal to the number of hidden nodes in the optimal topology of the ANN model.
![]() | (A) |
Because the ANN can be expressed as an algebraic expression, its symbolic derivative, dp(ΔCt)/dΔCt, can also be used directly to generate the frequency distribution of a given molecular marker. The derivative of the ANN is described in Eq. B. The m-files (in MATLAB code) implementing Eq. A and its symbolic derivative can be obtained from J.S.A. at almeidaj{at}musc.edu.
![]() | (B) |
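For orientation, a generic single-hidden-layer feedforward network of the kind described, together with its derivative, might be sketched as follows. The tanh activation, the scalar output bias b_2, and the hidden-node count H are assumptions made for illustration; they are not taken from Eq. A or Eq. B, and the text itself describes b2 as a vector.

$$
p(\Delta C_t) \;=\; b_2 + \sum_{j=1}^{H} w_{2,j}\,\tanh\!\left(w_{1,j}\,\Delta C_t + b_{1,j}\right)
$$

$$
\frac{dp(\Delta C_t)}{d\,\Delta C_t} \;=\; \sum_{j=1}^{H} w_{2,j}\,w_{1,j}\left[1 - \tanh^{2}\!\left(w_{1,j}\,\Delta C_t + b_{1,j}\right)\right]
$$

Under this reading, p behaves like a fitted cumulative curve over ΔCt, and its derivative plays the role of the frequency distribution.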
Statistical analyses. Regression analyses were done using SAS version 9.0 Software (SAS Institute, Inc., Cary, NC).
RNA isolation for dilutional microarray analysis. For microarray analysis, we used a metastatic ALN in which mam was overexpressed at a level 5.3 × 10⁷-fold higher than the mean expression in normal lymph nodes. In addition, four normal lymph nodes were used. RNA quality and quantity were assessed with an Agilent 2100 Bioanalyzer System (Agilent Technologies, Inc., Palo Alto, CA). RNA from the metastatic lymph node was diluted into a pool of normal lymph node RNA at ratios of 1:50, 1:2,500, and 1:125,000. For all of these conditions, expression values were obtained for a total of 22,283 gene transcripts spotted on an Affymetrix U133A array. Total cellular RNA was isolated as follows:
A 0.15-g sample of lymph node tissue was homogenized in 1 mL of RNA STAT-60 (TEL-TEST, Friendswood, TX) using a model 395 type 5 polytron (Dremel, Racine, WI). Total RNA isolation was done as per the manufacturer's instructions up to the aqueous phase separation. The aqueous phase containing RNA was removed from the organic phase and mixed with an equal volume of 70% ethanol. The sample was then loaded onto an RNeasy Mini column (Qiagen, Valencia, CA) and purified according to the manufacturer's protocol. The RNA pellet was dissolved in 50 µL of RNase-free water.
GeneChip microarray analysis. Expression levels of 22,283 gene transcripts were determined on oligonucleotide microarrays using: (a) pooled RNA from four normal lymph nodes, (b) RNA from an ALN with a large breast cancer metastasis, and (c) RNA from an ALN with a large breast cancer metastasis diluted into pooled normal lymph node RNA at dilutions of 1:50, 1:2,500, and 1:125,000. Eight micrograms of total RNA per sample were used for microarray analysis. First- and second-strand cDNA synthesis, double-stranded cDNA cleanup, biotin-labeled cRNA synthesis, cleanup, and fragmentation were done according to protocols in the Affymetrix GeneChip Expression Analysis technical manual (Affymetrix, Santa Clara, CA). Microarray analysis was done by the DNA Microarray and Bioinformatics Core Facility at the Medical University of South Carolina using U133A GeneChips (Affymetrix). Fluorescent images of hybridized microarrays were obtained using an HP GeneArray scanner (Affymetrix). For normalization, the microarray office suite was used such that all fluorescence values were multiplied by a factor that resulted in a mean fluorescent score for all genes equal to 150.
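The normalization step described above (one multiplicative factor per array, chosen so that the mean signal equals 150) can be sketched in a few lines of Python. This is an illustrative sketch only: the function name and the five raw intensities are hypothetical, and the analysis software used in the study may handle details such as trimming or background differently.

```python
def scale_to_target_mean(signals, target_mean=150.0):
    """Multiply every signal by a single factor so the array mean equals target_mean."""
    if not signals:
        raise ValueError("signals must not be empty")
    current_mean = sum(signals) / len(signals)
    if current_mean <= 0:
        raise ValueError("mean signal must be positive")
    factor = target_mean / current_mean
    return [value * factor for value in signals], factor


# Hypothetical raw intensities for five probe sets (not data from the study).
raw = [20.0, 80.0, 100.0, 300.0, 500.0]
scaled, factor = scale_to_target_mean(raw)
print(factor, scaled)
```

Because every value on an array is multiplied by the same factor, the ranking of genes within that array is unchanged; only comparisons between arrays are affected.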
Real-time reverse transcription-PCR validation of dilutional microarray analysis on frozen tissue samples. Twenty H&E (+) ALN, 40 control cervical lymph nodes, and 72 H&E (−)/PCR (+) ALN were used in this study. Frozen tissue specimens were obtained as part of the MIMS Trial, which was approved by the Institutional Review Board at the Medical University of South Carolina and all participating institutions. mRNA sequences of genes identified in this study were retrieved from the National Center for Biotechnology Information database. Intron-spanning primers were designed and tested in breast cancer cell lines MDA-MB-231 or SK-BR-3: TFF1 forward 5'-AATGGCCACCATGGAGAACA-3', reverse 5'-ACCACAATTCTGTCTTTCACGG-3'; TFF3 forward 5'-TTTGACTCCAGGATCCCTGGAG-3', reverse 5'-AGGTGCCTCAGAAGGTGCATTC-3'; PRO1708 forward 5'-AAGAATGCCCTGTGCAGAAGAC-3', reverse 5'-TTCTGTGCAGCATTTGGTGACT-3'; Lipophilin B forward 5'-ACGGATCAGATGTCCCTTCAG-3', reverse 5'-TTGAAAGACAGTGGAAACCAGG-3'; FBJ forward 5'-CGTTGTGAAGACCATGACAGGA-3', reverse 5'-TCCTTTCCCTTCGGATTCTCC-3'. Primers to the mam gene have been previously described (11, 12). cDNA was made from 5 µg of total RNA using 200 units of Moloney murine leukemia virus reverse transcriptase (Promega, Madison, WI) and 0.5 µg oligo(dT)12-16 in a reaction volume of 20 µL (10 minutes at 70°C, 50 minutes at 42°C, and 15 minutes at 70°C). Real-time RT-PCR analysis was done on a PE Biosystems GeneAmp 5700 Sequence Detection System (Foster City, CA). The standard reaction volume was 10 µL and contained 1x QuantiTect SYBR Green PCR Master Mix (Qiagen), 0.1 unit AmpErase UNG enzyme (PE Biosystems), 0.7 µL cDNA template, and 0.25 µmol/L each of forward and reverse primer. The initial step of PCR was 2 minutes at 50°C for AmpErase UNG activation, followed by a 15-minute hold at 95°C. Cycles (n = 40) consisted of a 15-second denaturation step at 95°C followed by a 1-minute annealing/extension step at 60°C. The final step was a 60°C incubation for 1 minute.
All reactions were done in triplicate.
Real-time reverse transcription-PCR validation of dilutional microarray analysis on paraffin-embedded tissue samples. A 20- to 50-µm section was cut from nine H&E (+) ALN tissue blocks for mRNA extraction following the method of Specht et al. (13). An adjacent 5-µm section was cut for standard H&E staining and examined by a pathologist to confirm the presence or absence of metastatic breast cancer. Briefly, paraffin-embedded tissue sections were deparaffinized twice with 1 mL of xylene at 37°C or room temperature for 10 minutes. The pellet was subsequently washed with 1 mL each of 100%, 90%, and 70% ethanol and air-dried at room temperature for 2 hours. The pellet was resuspended in 200 µL of RNA lysis buffer [2% lauryl sulfate, 10 mmol/L Tris-HCl (pH 8.0), and 0.1 mmol/L EDTA] and 100 µg of proteinase K and incubated at 60°C for 16 hours. RNA was extracted using 1 mL of phenol/chloroform (5:1) solution (Sigma, St. Louis, MO). The aqueous layer containing RNA was transferred to a new 1.5-mL tube. Phenol/chloroform extraction was done a total of three times. RNA was precipitated with an equal volume of isopropanol, 0.1 volume of 3 mol/L sodium acetate, and 100 µg of glycogen at −20°C for 16 hours. After centrifugation at 12,000 rpm for 15 minutes (4°C), the RNA pellet was washed with 70% ethanol and air-dried at room temperature for 2 hours. Finally, the pellet was dissolved in 12 µL of DEPC water. cDNA synthesis was done as described above, with the exception that 500 ng of a panel of truncated gene-specific primers were used instead of oligo(dT)12-16. Truncated gene-specific primers for reverse transcription were designed to correspond to the 5'-end of the reverse primer designed for real-time PCR: TFF1 5'-ACCACAATTCTGTCTT-3', TFF3 5'-AGGTGCCTCAGAAG-3', PRO1708 5'-TTCTGTGCAGCAT-3', Lipophilin B 5'-TTGAAAGACAGTGGAA-3', FBJ 5'-TCCTTTCCCTTCGG-3', and mam as described previously (14).
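The stated design rule, that each truncated reverse-transcription primer corresponds to the 5'-end of the matching real-time PCR reverse primer, can be checked directly against the sequences listed in Materials and Methods. The short Python sketch below does only that; the dictionary and function names are illustrative, and the mam primers are omitted because their sequences are given only by reference.

```python
# Reverse primers for real-time PCR, as listed for the frozen-tissue assays (5'->3').
REVERSE_PRIMERS = {
    "TFF1": "ACCACAATTCTGTCTTTCACGG",
    "TFF3": "AGGTGCCTCAGAAGGTGCATTC",
    "PRO1708": "TTCTGTGCAGCATTTGGTGACT",
    "Lipophilin B": "TTGAAAGACAGTGGAAACCAGG",
    "FBJ": "TCCTTTCCCTTCGGATTCTCC",
}

# Truncated gene-specific primers used for reverse transcription (5'->3').
TRUNCATED_RT_PRIMERS = {
    "TFF1": "ACCACAATTCTGTCTT",
    "TFF3": "AGGTGCCTCAGAAG",
    "PRO1708": "TTCTGTGCAGCAT",
    "Lipophilin B": "TTGAAAGACAGTGGAA",
    "FBJ": "TCCTTTCCCTTCGG",
}


def is_five_prime_truncation(full_primer, truncated_primer):
    """True if truncated_primer is the 5' portion of full_primer."""
    return full_primer.startswith(truncated_primer)


check = {
    gene: is_five_prime_truncation(REVERSE_PRIMERS[gene], truncated)
    for gene, truncated in TRUNCATED_RT_PRIMERS.items()
}
for gene, ok in check.items():
    print(f"{gene}: {len(TRUNCATED_RT_PRIMERS[gene])} nt, 5' truncation = {ok}")
```

All five truncated primers listed here are prefixes of their reverse primers, with lengths of 13 to 16 nucleotides.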
| Results |
|---|
Relative levels of gene expression for individual molecular markers are correlated with apparent sensitivity for the detection of micrometastatic breast cancer. Using the mean expression values for metastatic breast cancer (Fig. 1, peak 1) and normal lymph nodes (Fig. 1, arrowhead), we were able to calculate the relative levels of gene expression (RLGE) for all seven markers using the 2^(−ΔΔCt) method (ref. 15; Table 1). Muc1 had the lowest RLGE value (3.6 × 10²), whereas mam had the highest (1.9 × 10⁶).
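As a worked illustration of the 2^(−ΔΔCt) calculation behind the RLGE values, the Python sketch below converts a pair of mean ΔCt values into a fold difference and, in the other direction, converts a reported RLGE into the equivalent number of PCR cycles. It assumes that ΔCt is the target Ct minus a reference-gene Ct, so that a smaller ΔCt means more transcript; the function names and the example ΔCt values (22.0 and 2.0) are hypothetical.

```python
import math


def relative_expression(delta_ct_normal, delta_ct_metastatic):
    """Fold overexpression by 2^(-ΔΔCt), with ΔΔCt = ΔCt_metastatic - ΔCt_normal."""
    delta_delta_ct = delta_ct_metastatic - delta_ct_normal
    return 2.0 ** (-delta_delta_ct)


def cycles_for_fold(fold):
    """Number of PCR cycles (|ΔΔCt|) corresponding to a given fold difference."""
    return math.log2(fold)


# Hypothetical mean ΔCt values: 22.0 in normal nodes, 2.0 in metastatic nodes.
rlge = relative_expression(22.0, 2.0)
print(rlge)                      # 2**20

# RLGE values quoted in the text, expressed as cycle differences.
print(cycles_for_fold(1.9e6))    # mam, roughly 21 cycles
print(cycles_for_fold(3.6e2))    # muc1, roughly 8.5 cycles
```

Read this way, the gap between the highest and lowest RLGE values reported here is a difference of roughly 12 PCR cycles.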
To test this hypothesis, we did a microarray analysis whereby RNA isolated from a highly metastatic (breast cancer) ALN was diluted into normal lymph node RNA as described in Materials and Methods. Candidate breast cancer-associated genes from this analysis were then selected based on the following criteria: (a) absence of expression in the pooled normal lymph nodes, (b) a fluorescence signal above 500 relative units for the undiluted breast cancer sample, and (c) a fluorescence signal that was present in the 1:50 dilution. The percentages of genes that met criteria a, b, and c were 52%, 8.1%, and 52%, respectively. The median relative fluorescent value for all genes was 74. Seventy-one genes were identified by criteria a and b, whereas 34 genes were identified by criteria a, b, and c. The 34 genes were sorted by relative intensity of metastatic signal and the top 15 are listed in Table 2, along with genes that we have used in the past for molecular detection of micrometastatic disease (indicated in bold). Of note, of the 34 genes identified by criteria a, b, and c, only mam and trefoil factor 1 (TFF1) had fluorescence signals above 1,000 fluorescent units in the 1:50 dilution. These results suggest that both the mam and TFF1 genes may be informative molecular markers for the detection of micrometastatic breast cancer. The gene with the highest relative intensity was mam, a result that is consistent with results from the MIMS Trial, where mam was noted to be the molecular marker that was most highly expressed in ALN containing metastatic breast cancer, as well as being the most informative marker for the detection of micrometastatic breast cancer (8).
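The three selection criteria amount to a simple filter followed by a sort on the undiluted metastatic signal. The Python sketch below shows one way this could look; the `ProbeSet` fields, the use of boolean detection calls for criteria a and c, and the five example probe sets are all assumptions for illustration, not the study's data or software.

```python
from dataclasses import dataclass


@dataclass
class ProbeSet:
    name: str
    present_in_normal: bool   # criterion a: detected in pooled normal lymph nodes?
    undiluted_signal: float   # criterion b: signal in the undiluted metastatic node
    present_at_1_50: bool     # criterion c: detected in the 1:50 dilution?


def select_candidates(probes, min_undiluted=500.0):
    """Keep probe sets meeting criteria a, b, and c; sort by undiluted signal, highest first."""
    selected = [
        p for p in probes
        if not p.present_in_normal
        and p.undiluted_signal > min_undiluted
        and p.present_at_1_50
    ]
    return sorted(selected, key=lambda p: p.undiluted_signal, reverse=True)


# Hypothetical probe sets (placeholders, not genes or values from the study).
probes = [
    ProbeSet("gene_A", False, 6000.0, True),   # passes a, b, c
    ProbeSet("gene_B", False, 900.0, False),   # fails c
    ProbeSet("gene_C", True, 4000.0, True),    # fails a
    ProbeSet("gene_D", False, 300.0, True),    # fails b
    ProbeSet("gene_E", False, 1200.0, True),   # passes a, b, c
]
candidates = [p.name for p in select_candidates(probes)]
print(candidates)
```

Dropping criterion c from the filter corresponds to the larger 71-gene set described above; adding it narrows the list to genes whose signal survives a 50-fold dilution.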
A closer examination of the results of the microarray analyses in this study confirms the limitations of a standard (undiluted) microarray approach to gene identification. The fluorescence signal for mam was 6,348 in the undiluted sample, 1,335 in the 1:50 dilution, 38 at the 1:2,500 dilution, and at background levels in the 1:125,000 dilution. However, based on real-time RT-PCR measurements, we determined that mam was overexpressed in this particular ALN at a level 5.3 × 10⁷-fold higher than the mean expression in normal lymph nodes. We can conclude, therefore, that without dilution, mam is in the saturated range, whereas at the 1:50 and 1:2,500 dilutions, mam is at the upper and lower end of the linear detection range, respectively. Based on these findings, we conclude that for highly expressed genes, the hybridization signal at the undiluted level is likely to become saturated and is unlikely to be proportional to gene copy number.
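The saturation argument can be made concrete with the mam signals quoted above. If signal were proportional to transcript abundance, each 50-fold dilution step would reduce the signal about 50-fold. The Python sketch below compares the observed drop at each step with that expectation; it is rough arithmetic that ignores background and noise, and the variable names are illustrative.

```python
# mam fluorescence signals quoted in the text, keyed by dilution factor.
mam_signal = {1: 6348.0, 50: 1335.0, 2500: 38.0}
STEP = 50.0  # each successive dilution is a further 50-fold


def observed_fold_drop(signal_before, signal_after):
    """Observed ratio of signals across one dilution step."""
    return signal_before / signal_after


drop_undiluted_to_50 = observed_fold_drop(mam_signal[1], mam_signal[50])
drop_50_to_2500 = observed_fold_drop(mam_signal[50], mam_signal[2500])

print(f"undiluted -> 1:50  : {drop_undiluted_to_50:.1f}-fold (expected ~{STEP:.0f})")
print(f"1:50 -> 1:2,500    : {drop_50_to_2500:.1f}-fold (expected ~{STEP:.0f})")
```

The first step shows less than a 5-fold drop for a 50-fold dilution, which fits a saturated undiluted signal, whereas the second step shows about a 35-fold drop, much closer to proportional behavior.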
Real-time reverse transcription-PCR confirms that TFF1 is highly expressed in axillary lymph nodes containing metastatic breast cancer. To determine whether TFF1 and/or other markers identified by dilutional microarray analysis were potentially useful for the detection of metastatic and/or micrometastatic breast cancer, we selected the five most highly expressed genes (TFF1, TFF3, PRO1708, Lipophilin B, and FBJ) for further analyses. Primers (see Materials and Methods) were designed and validated using cDNA prepared from the breast cancer cell lines MDA-MB-361 and/or MDA-MB-231. Gene expression levels were determined in ALN containing metastatic breast cancer (n = 20), as well as in control lymph nodes (n = 8; n = 40 for TFF1; Fig. 3). In control lymph nodes, ΔCt values for TFF1 ranged from 16.3 to 25.2 (mean, 22.4 ± 2.1), providing evidence that this gene is expressed poorly in normal tissue. In contrast, the ΔCt values for TFF1 in pathology-positive lymph nodes ranged from 2.6 to 23.7 (mean, 12.7 ± 8.1). Using a threshold of three SDs beyond the mean of control lymph nodes, we observed that at least one marker was overexpressed in 17 of 20 (85%) metastatic lymph nodes. TFF1 was overexpressed in 10 of 20 (50%) metastatic ALN, providing evidence that this gene may be an informative marker for detection of metastatic disease. Consistent with previous studies (8, 11), mam was overexpressed in 14 of 20 (70%) samples. Of note, of the three samples that were marker positive but negative for mam, one was positive for TFF1 and two were positive for FBJ. Of the remaining candidate genes, lipophilin B seemed a potentially informative marker and was overexpressed in 9 of 20 (45%) specimens. The sensitivities of the other markers tested were as follows: FBJ 6 of 20 (30%), TFF3 4 of 20 (20%), and PRO1708 1 of 20 (5%). Although TFF1 was not as sensitive as mam, the level of overexpression of TFF1 was comparable to the level of overexpression observed with mam (overexpression in individual ALN of up to 1.1 × 10⁵ and 1.0 × 10⁶, respectively). Because TFF1 was the only gene whose level of detection for metastatic disease was not significantly different from that of mam at P < 0.05 (χ² test; data not shown), we chose to analyze this gene in further detail.
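The three-SD threshold can be illustrated with the TFF1 control statistics quoted above (mean ΔCt 22.4, SD 2.1). The Python sketch below assumes that overexpression corresponds to a ΔCt lower than the control mean minus three SDs, which is the natural reading when a lower ΔCt means more transcript; the function names are illustrative and the classification rule used in the study may differ in detail.

```python
def overexpression_cutoff(control_mean, control_sd, n_sd=3.0):
    """ΔCt cutoff: n_sd standard deviations below the control mean."""
    return control_mean - n_sd * control_sd


def is_overexpressed(delta_ct, cutoff):
    """A sample is called overexpressed if its ΔCt falls below the cutoff."""
    return delta_ct < cutoff


# TFF1 control statistics quoted in the text.
cutoff = overexpression_cutoff(22.4, 2.1)
print(f"TFF1 cutoff ΔCt: {cutoff:.1f}")

# Extremes of the ranges quoted in the text.
print(is_overexpressed(16.3, cutoff))  # lowest control ΔCt
print(is_overexpressed(2.6, cutoff))   # lowest ΔCt among pathology-positive nodes
print(is_overexpressed(23.7, cutoff))  # highest ΔCt among pathology-positive nodes
```

With these numbers the cutoff is a ΔCt of about 16.1, just below the lowest control value of 16.3, so no control node would be called positive under this rule.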
| Discussion |
|---|
Given this background, we did a statistical analysis of the molecular data from the MIMS Trial to determine predictors of informative markers for the detection of micrometastatic disease. Specifically, the data from the MIMS Trial provide a unique opportunity to explore the relationship between relative levels of gene expression in metastatic breast cancer and the ability to detect micrometastatic breast cancer. The large sample size (n = 489 breast cancer subjects) and quantitative data generated in the MIMS Trial have made it possible for the first time to develop artificial neural networks capable of generating statistically meaningful, accurate frequency distribution analyses of molecular markers known to be associated with breast cancer. In this article, we generated frequency distribution analyses from three different populations: subjects with pathology-positive ALN, subjects with pathology-negative ALN, and control subjects with no evidence of malignancy. The frequency distribution analyses seem internally consistent; the pathology-positive and pathology-negative populations are bimodal, and the peaks representing the low-expressing populations correspond to the mean of the control populations (Fig. 1). Furthermore, based on these frequency distribution analyses, we were able to calculate RLGE values for each molecular marker, a surrogate for the degree of gene overexpression observed in metastatic breast cancer. Finally, we did logistic regression analyses that seem to confirm the hypothesis that relative levels of gene overexpression in metastatic tissue are associated with apparent sensitivity for detection of micrometastatic disease.
Based on this hypothesis, we proceeded with the development of a novel microarray strategy for the rapid identification of molecular markers that are informative for the detection of micrometastatic breast cancer. Microarray analysis has proven to be a powerful tool for studying the mRNA expression profiles of normal and neoplastic tissues. However, the ability of this technology to identify informative molecular markers for the detection of micrometastatic disease has been limited. One major limitation of microarray analysis is that it is only semiquantitative. Thus, it is often difficult to determine which of several hundred candidate genes are likely to be most informative for detection of micrometastatic disease. We reasoned that the identification of informative markers for the detection of micrometastatic disease can be simplified by dilution of metastatic tissue (or RNA) into an excess of normal tissue (or RNA).
For our analyses, RNA from a metastatic lymph node was extracted and serially diluted into a pool of normal lymph node RNA at ratios of 1:50, 1:2,500, and 1:125,000. By virtue of this dilution strategy, we were able to rapidly identify those genes that were overexpressed at the highest level in metastatic tissue. Candidate marker genes were chosen based on three selection criteria and validated by real-time RT-PCR using experimental and control lymph nodes. Of the 22,283 gene transcripts contained on the Affymetrix U133A chip, 34 met the three selection criteria. The most highly overexpressed gene was mam, a result that was consistent with our previous studies. Besides mam, only trefoil factor 1 (TFF1) had a signal in the 1:50 dilution that was above 1,000 relative fluorescent units. In fact, the intensity signals of TFF1 were similar to those of mam. The results of our analyses suggest that of the dilutions used, the 1:50 dilution was the most informative for identification of genes useful for detection of micrometastatic disease. Real-time RT-PCR analyses of pathology-negative ALN that had shown cancer-associated gene overexpression in the MIMS study (n = 72) confirm that of all the markers tested, mam and TFF1 have the highest apparent sensitivity for detection of micrometastatic breast cancer (Fig. 4B).
TFF1, also known as gastrointestinal trefoil protein pS2, breast cancer estrogen-inducible sequence (BCEI), pNR-2, and Md2, is a secretory polypeptide encoded by a gene on chromosome 21q22.3. TFF1 is involved in the formation of mucus and is highly expressed in stomach epithelium. Interestingly, TFF1 expression has also been found in the regeneration stage of ulcerative and inflammatory gastrointestinal disorders and in various human carcinomas including breast carcinoma (17). In breast cancer, TFF1 is regulated by estrogen and there is evidence that it can be used as a surrogate indicator for the response to antihormonal therapy and more favorable outcome. For example, Gillesby et al. showed that TFF1 mRNA levels in breast cancer were positively associated with both estrogen receptor and progesterone receptor status and that TFF1 was primarily expressed in small (T = 2.0 cm) but well-differentiated tumors (grade 1 and 2; ref. 18). In support of a prognostic role of TFF1 in breast cancer, Thompson et al. reported that the combination of lymph node status and TFF1 expression (analyzed by Northern blot) discriminated patients with good prognosis (node negative, TFF1 positive) and patients with poor prognosis (node positive, TFF1 negative; ref. 19). However, other studies suggest that high levels of TFF1 expression may promote cancer cell invasion (particularly in interval cancers; ref. 20) and may be involved in establishing distant metastasis (21). To our knowledge, only one research group (van't Veer et al.) has studied TFF-1 and TFF-3 (also called p1B) as diagnostic markers for detection of metastatic breast cancer in axillary nodes (22) and in peripheral blood (23). Contrary to our results, their data indicate that TFF-3 is superior to TFF-1 for detection of metastatic disease. Although we suspect that the discrepancy is due to selection of a limited set of control samples and/or threshold values that were too high or too low, we cannot rule out the possibility that selection of primer sequences may play a role.
In conclusion, we have been able to provide a meaningful statistical evaluation of the concept that relative levels of gene expression/overexpression are correlated with the ability to detect micrometastatic disease. Furthermore, we have used this information to design an innovative microarray strategy for the rapid identification of marker genes that can be used for the molecular detection of micrometastatic cancer. The microarray analyses confirmed that mam is one of the most valuable molecular markers for the detection of micrometastatic breast cancer and identified TFF1 as another highly informative marker. These results underscore the importance of relative gene expression levels in the evaluation of candidate molecular markers for molecular detection of cancer.
| Acknowledgments |
|---|
| Footnotes |
|---|
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
5 MK et al., unpublished results.
Received 10/25/04; revised 2/4/05; accepted 2/24/05.
| References |
|---|
15. Livak KJ, Schmittgen TD. Analysis of relative gene expression data using real-time quantitative PCR and the 2(−ΔΔC(T)) method. Methods 2001;25:402–8.