Purpose: Molecular characterization of circulating tumor cells (CTC) holds great promise. Unfortunately, routinely isolated CTC fractions currently still contain contaminating leukocytes, which makes CTC-specific molecular characterization extremely challenging. In this study, we determined mRNA and microRNA (miRNA) expression of potentially CTC-specific genes that are considered to be clinically relevant in breast cancer.
Experimental Design: CTCs were isolated with the epithelial cell adhesion molecule–based CellSearch Profile Kit. Selected genes were measured by real-time reverse transcriptase PCR in CTCs of 50 metastatic breast cancer patients collected before starting first-line systemic therapy, in blood from 53 healthy blood donors (HBD), and in primary tumors of 8 of the patients. The molecular profiles were associated with CTC counts and clinical parameters and compared with the profiles generated from the corresponding primary tumors.
Results: We identified 55 mRNAs and 10 miRNAs more abundantly expressed in samples from 32 patients with at least 5 CTCs in 7.5 mL of blood compared with samples from 9 patients without detectable CTCs and HBDs. Clustering analysis resulted in 4 different patient clusters characterized by 5 distinct gene clusters. Twice the number of patients from cluster 2 to 4 had developed both visceral and nonvisceral metastases. Comparing transcript levels in CTCs with those measured in corresponding primary tumors showed clinically relevant discrepancies in estrogen receptor and HER2 levels.
Conclusions: Our study shows that molecular profiling of low numbers of CTCs in a high background of leukocytes is feasible and shows promise for further studies on the clinical relevance of molecular characterization of CTCs. Clin Cancer Res; 17(11); 3600–18. ©2011 AACR.
This article is featured in Highlights of This Issue, p. 3507
Metastases, which may develop several years after occurrence of the primary tumor and after prior (neo)adjuvant therapy, can differ greatly from primary tumor tissue in terms of genetic characteristics. Taking biopsies from metastases in patients, however, is an invasive procedure and frequently impossible due to the lack of accessible lesions. Circulating tumor cells (CTC) are tumor cells shed from either the primary tumor or its metastases that circulate in the peripheral blood of patients and can thus be regarded as “liquid biopsies” of metastasizing cells. In this study, we show for the first time the feasibility of extensive molecular characterization of CTCs at both the mRNA and microRNA level in a high background of leukocytes and show its applicability in a cohort of 50 metastatic breast cancer patients. It is anticipated that such an extensive molecular characterization of CTCs will improve the currently available prognostic and predictive models on the basis of primary tissue.
Molecular characterization of primary tumors has already greatly contributed to the personalized treatment of cancer patients. High-throughput techniques have yielded the knowledge of mutations or epigenomic changes in certain genes and prognostic and predictive models on the basis of mRNA and microRNA (miRNA) expression profiles (1–6). Combined with classical tumor characteristics, these models are increasingly used to guide individualized treatment of patients, thereby aiming to avoid over- or undertreatment. However, most of these prognostic and predictive models have been developed based on primary tumor tissue, whereas metastases, rather than the primary tumor, determine the clinical outcome of cancer patients. It has been shown that metastases, which may develop several years after occurrence of the primary tumor and after prior systemic therapy in the adjuvant or neoadjuvant setting, can differ greatly from primary tumor tissue in terms of genetic characteristics (7–13). It is therefore anticipated that molecular characterization of metastases will improve the currently available prognostic and predictive models. Taking biopsies from metastases in patients, however, is an invasive and often painful procedure, and frequently impossible due to the lack of accessible lesions.
Circulating tumor cells (CTC) are found in the peripheral blood of patients and are shed from either the primary tumor or its metastases. A recently developed technology to quantify the number of CTCs in whole blood (WB) is the CellSearch CTC Test (Veridex LLC). So far, this is the only test that has been approved by the U.S. Food and Drug Administration (FDA; ref. 14) for the detection and enumeration of CTCs in metastatic prostate (15), colorectal (16), and breast (17) cancer as an independent prognostic factor. After enrichment using magnetic beads coated with anti-epithelial cell adhesion molecule (EpCAM) antibodies, isolated cells are stained with fluorescently labeled monoclonal antibodies specific for epithelial cells (CK-8/18/19) and leukocytes (CD45) and with the nuclear staining dye 4′,6-diamidino-2-phenylindole (DAPI), and are subsequently enumerated with a semiautomated fluorescence microscope.
In addition to enumeration, CTCs can also be isolated for molecular characterization. This may provide insight into the molecular biology of metastasis and the association of CTC molecular profiles with treatment outcomes, and may reveal the presence of potential druggable targets. However, although EpCAM-based enrichment eliminates a large proportion of leukocytes (approximately 4-log depletion), there are still considerable quantities of contaminating leukocytes (DAPI+/CD45+) present after this enrichment (18). This contamination, together with the low frequency of CTCs, forms a challenge when aiming to characterize CTCs by very sensitive molecular methods such as PCR.
Despite these challenges, we have recently shown the feasibility of determining mRNA expression of epithelial-specific genes in CTC-enriched samples (18). In addition to mRNA, another class of RNAs that increasingly attracts attention is the group of miRNAs. Each miRNA targets, on average, 200 mRNA transcripts by which miRNAs execute widespread control (19). As might be expected based on these activities, altered expression of specific miRNA genes has already been shown to contribute to the initiation and progression of cancer (20–22). Therefore, miRNA-based cancer gene therapy offers the theoretical appeal of targeting multiple gene networks that are controlled by a single aberrantly expressed miRNA (23), making the profiling of miRNAs in cancer even more appealing, especially in the context of CTCs.
Here, we describe the optimization of a method to perform both miRNA and mRNA expression analysis for multiple genes by real-time reverse transcriptase PCR (RT-PCR) on as few as 5 CTCs isolated from 7.5 mL of blood, which is considered the clinically relevant cutoff in patients with metastatic breast cancer (24–26), in an environment containing excess quantities of up to 1,000 (18) contaminating leukocytes. As shown in this study for patients with metastatic breast cancer, this robust and novel method allows the simultaneous determination of 65 epithelial tumor cell–specific miRNA and mRNA expression levels in CTCs enriched by CellSearch, and the exploration of their clinical relevance on the basis of the identification of 4 different patient clusters with distinct characteristics.
Materials and Methods
This study was approved by the Erasmus MC and local Institutional Review Boards (METC 2006-248), and all donors and patients gave their written informed consent.
Breast tumor tissues and blood samples
From 61 patients with metastatic breast cancer, 2 × 7.5 mL blood samples were prospectively taken for CTC enumeration and isolation (for details see next) prior to initiation of systemic therapy for metastatic disease. From these 61 samples, 11 (18%) were excluded because of insufficient RNA quality and/or quantity (for details see next), rendering a total number of 50 patients eligible for further analysis. Metastatic breast cancer patients had been included at the start of systemic therapy between February 2008 and December 2009 in 4 hospitals (9 patients in the Erasmus Medical Center, Rotterdam, The Netherlands; 10 in the Ikazia Hospital, Rotterdam, The Netherlands; 10 in the Maasstad Hospital, Rotterdam, The Netherlands; and 21 in the Oncology Center GZA St-Augustinus, Antwerpen, Belgium). For 8 of 32 patients with at least 5 CTCs, primary tumor tissue containing at least 50% invasive epithelial tumor cells was available for RNA isolation [5 fresh frozen (FF) and 3 formalin-fixed paraffin-embedded (FFPE)]. These 8 specimens were used for comparison of transcript levels between CTCs and corresponding primary tumors. Detailed clinicopathological information for these 50 patients and the 8 matching primary tissues is given in Table 1 and in Supplementary Table S1 after dichotomization of patients at the breast cancer clinically relevant level of 5 CTCs (24–26). Fifty-three healthy blood donor (HBD) blood samples were drawn from laboratory volunteers and blood donors of the Sanquin Blood Bank South-west Region.
Enumeration of CTCs
Prior to the administration of first-line systemic therapy, 7.5 mL of blood from HBDs and metastatic breast cancer patients was drawn in CellSave tubes (Veridex LLC). For CTC enumeration, samples were processed on the CellTracks AutoPrep System (Veridex LLC) by using the CellSearch Epithelial Cell Kit (Veridex LLC) and CTC counts were determined on the CellTracks Analyzer (Veridex LLC) according to the manufacturer's instructions and as described previously (27, 28).
miRNA and mRNA isolation from CTCs, FF, and FFPE
For gene expression studies, in parallel with the enumeration studies, 7.5 mL of blood from the same healthy donors and patients was drawn in EDTA tubes and enriched for CTCs on the CellTracks AutoPrep System using the CellSearch Profile Kit (Veridex LLC). RNA isolation was performed with the AllPrep DNA/RNA Micro Kit (Qiagen) according to the manufacturer's instructions. A more detailed description is given in the Supplementary Materials and Methods. Total RNA was isolated from FF tissue with RNA-Bee as described previously (29) and from FFPE tissue with the column-based High Pure RNA Paraffin Kit (Roche Applied Science) according to the manufacturer's instructions.
Stem-loop cDNA synthesis, preamplification, and real-time PCR (quantitative RT-PCR)
The generation of preamplified cDNA from total RNA from the FF and FFPE tissues and the >200 nucleotide (nt) RNA fractions and subsequent TaqMan-based quantitative RT-PCR (qRT-PCR) analysis, and the validation procedures to ensure homogeneous amplification, were performed as described before (18).
To analyze miRNAs, a multiplex stem-loop cDNA approach was used essentially as described before (21). In brief, up to 50 different RT primers (250 nmol/L each) were pooled, concentrated for 60 minutes in a speed vacuum centrifuge at 50°C, and resuspended in nuclease-free ddH2O (double distilled water) to a final concentration of 50 nmol/L each. The use of a specific primer with a hairpin structure during cDNA synthesis and mature miRNA-specific detection probes precluded the detection of precursor miRNAs. A total of 25 to 50 ng of total RNA sample aliquots were reverse-transcribed in a final volume of 20 μL with a final concentration of 12.5 nmol/L for each RT primer using the TaqMan miRNA for reverse transcription kit [Applied Biosystems (ABI)] according to the manufacturer's instructions and as described before (21).
For the miRNA quantification in the CTC samples, 3-μL aliquots of the ≤200 nt RNA fraction were reverse-transcribed in a final volume of 7.5 μL with a final concentration of 12.5 nmol/L for each RT primer (ABI), 0.65 mmol/L of each dNTP (ABI), 3 mmol/L magnesium chloride (Invitrogen), 0.3 U/μL RNase inhibitor (ABI), 15 U/μL RevertAid H Minus enzyme (Fermentas), and 1x RT buffer (Fermentas). Cycling conditions were according to the “Megaplex RT reaction for TaqMan miRNA array” protocol from ABI, i.e., 40 cycles of 16°C for 2 minutes, 42°C for 1 minute, and 50°C for 1 second, followed by a final incubation at 85°C for 5 minutes and a cooldown to 4°C. Prior to PCR, half of the resulting multiplex cDNA was linearly preamplified in 15 cycles according to the manufacturer's instructions (TaqMan PreAmp from ABI) and as described previously for our multiplex gene expression studies (29). Before performing real-time PCRs for each of the miRNAs separately, RT samples were diluted in nuclease-free ddH2O. They were then analyzed by real-time PCR in a 20-μL reaction volume in an Mx3000P Real-Time PCR System (Stratagene) using the individual TaqMan miRNA primer and probe assays in combination with TaqMan Universal PCR Master Mix No AmpErase UNG (ABI) with cycling conditions according to the manufacturer's instructions.
To verify that the multiplex RT approach did not affect the quantification of specific miRNAs, all miRNA data were validated in a uniplex RT reaction. A pool consisting of RNA of different human breast tissues was included in each cDNA synthesis and preamplification run, and the resulting data were used to normalize for variation between experiments. In addition, all cDNA synthesis runs incorporated a minus RT reaction, which proved to be negative for all assays in this study. PCR efficiency, linearity, and the upper and lower detection limits of each of the individual miRNA assays were validated with a standard curve prepared from RNA of a pool of breast tumors. Negative controls included samples without RT and samples in which total RNA and cDNA were replaced with ddH2O. Quantitative values were obtained from the threshold cycle (Ct) at which the increase in TaqMan probe fluorescent signal associated with an exponential increase of PCR products reached the fixed threshold value of 0.08, which was in all cases at least 10-fold higher than the background signal.
First selection of potentially CTC-specific mRNA and miRNA transcripts
The specifics of the TaqMan assays used are given in Supplementary Table S2A for the miRNAs and Supplementary Table S2B for the mRNAs. For the identification phase of potentially CTC-specific miRNA transcripts, the TaqMan Human MicroRNA Assay Set (Sanger miRBase v10; ABI), consisting of 446 unique assays to quantify 436 miRNAs and 10 controls (small nucleolar RNAs; SNORs/RNUs), was used to screen a pool of 150 primary breast cancer RNAs. Of these 446 miRNAs, 253 were expressed in these breast cancer samples and approximately 200 had an expression level of more than 10% of the expression of the reference miRNA set (see next). Next, these levels were compared with those measured in a pool of 6 CellSearch-enriched preparations from HBDs for potentially differentially expressed miRNAs. These prescreen analyses selected 39 miRNAs with both notable expression in breast tumors and at least a 10-fold higher expression in breast tumors relative to CellSearch-enriched HBDs. Four additional miRNAs were included for other reasons, i.e., hsa-miR-452 to compare with hsa-miR-452# and hsa-miR-379 because of the observed difference between estrogen receptor (ER)-positive and ER-negative samples in the prescreen, RNU6B as being a potential reference miRNA, and hsa-miR-210, which has shown clinical relevance in breast cancer (refs. 21, 30; Supplementary Table S2A).
For the mRNA transcripts, clinically relevant and potentially CTC-specific genes were selected in silico on the basis of literature data and their reported low expression in white blood cells and higher expression in breast tumor tissues, according to the SAGE Genie Database of the Cancer Genome Anatomy Project (http://cgap.nci.nih.gov/SAGE/AnatomicViewer). These prescreen analyses were performed as described in detail before (18) and resulted in 90 mRNA transcripts, including 3 reference genes and 2 reference leukocyte markers that could be measured reliably by qRT-PCR and which were potentially higher expressed in breast tumor cells relative to leukocytes (Supplementary Table S2B).
Reference genes, data normalization, and quality control
Unless stated otherwise, levels of HMBS, HPRT1, and GUSB were used to control sample loading and >200 nt RNA quality, as described previously (29). Bone marrow stromal cell antigen 1 (BST1) and protein tyrosine phosphatase receptor type C (PTPRC coding for CD45) were the control genes for leukocyte background and keratin 19 (KRT19) was the control gene for CTC quantification (29).
Although appropriate reference molecules for miRNAs are still unknown for clinical breast cancer cells with a background of leukocytes, previous studies have shown that normalization on mean or median expression of all miRNAs measured in a sample can adequately reduce technical variation (31). Therefore, miRNA data of each individual sample were normalized on the median level of all miRNAs measured in that particular sample.
After verification of equal PCR efficiency for all assays, the relative expression levels were quantified by using the delta Ct method, in which delta Ct is the difference between the median Ct of the appropriate control genes and the Ct of the target gene. Only samples in which both the median Ct of all miRNAs and the median Ct of HMBS, HPRT1, and GUSB fell within an arbitrarily chosen cutoff of 26 Ct were considered of sufficient quality and quantity to enter the study. By the use of this threshold, 11 of our initial 61 patient CTC samples (18%) were excluded from further analysis.
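The delta Ct calculation and the 26 Ct quality threshold described above can be sketched as follows. This is a minimal illustration and not the analysis code used in the study; the function names and the example Ct values are hypothetical, and treating the cutoff as inclusive (Ct of 26 or lower passes) is an assumption.

```python
from statistics import median

QUALITY_CUTOFF_CT = 26.0  # arbitrarily chosen cutoff stated in the text


def delta_ct(target_ct, control_cts):
    """Delta Ct = median Ct of the control genes minus the Ct of the target gene."""
    return median(control_cts) - target_ct


def passes_quality_control(mirna_cts, reference_mrna_cts, cutoff=QUALITY_CUTOFF_CT):
    """Accept a sample only if both median Ct values fall within the cutoff.

    Assumption: the cutoff is inclusive (a median Ct equal to the cutoff passes).
    """
    return median(mirna_cts) <= cutoff and median(reference_mrna_cts) <= cutoff


# Hypothetical Ct values for one sample (illustrative only)
reference_cts = [24.0, 25.0, 26.5]      # e.g., HMBS, HPRT1, GUSB
mirna_cts = [22.0, 25.5, 27.0, 24.0]    # all miRNAs measured in the sample

print(delta_ct(23.0, reference_cts))    # a higher delta Ct means a more abundant target
print(passes_quality_control(mirna_cts, reference_cts))
```

With this sign convention, a target that amplifies earlier than the reference set receives a positive delta Ct, so larger values correspond to higher relative expression.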
Finally, all transcript data of the 50 CTC samples, 53 HBD controls, and 8 primary tumors were normalized to the Ct of the appropriate reference set, after which, for each individual assay, the median Ct measured in CellSearch-enriched HBDs (n = 31 for the mRNAs and n = 8 for the miRNAs) was used as a cutoff Ct for the CTC samples. All genes with Ct values exceeding this cutoff Ct were considered to be undetectable.
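The per-assay detection cutoff can be illustrated with a short sketch. It assumes that Ct values have already been normalized to the reference set; the function names and the example values are hypothetical and serve only to show the rule that a Ct above the HBD median is treated as undetectable.

```python
from statistics import median


def detection_cutoff(hbd_cts):
    """Cutoff Ct for one assay: the median Ct measured in CellSearch-enriched HBD samples."""
    return median(hbd_cts)


def is_detectable(sample_ct, cutoff_ct):
    """A Ct value exceeding the cutoff Ct is considered undetectable."""
    return sample_ct <= cutoff_ct


# Hypothetical normalized Ct values for one assay in enriched HBD samples
hbd_cts = [33.0, 34.5, 35.0, 36.0, 34.0]
cutoff = detection_cutoff(hbd_cts)

print(cutoff)
print(is_detectable(31.2, cutoff))  # amplifies earlier than the HBD median
print(is_detectable(35.1, cutoff))  # amplifies later than the HBD median
```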
Statistical analysis was done by SPSS 15.0 and Datan Framework GenEx Pro package version 126.96.36.199 software for real-time PCR expression profiling. Grubbs' test was used to define outlier data points (1.1%) that were replaced with the median value of all samples for the gene in question. The strengths of the associations between continuous variables were tested with the nonparametric Spearman rank correlation test (rs). Gene expression levels in the various fractions were compared with the nonparametric Wilcoxon's test to test the null hypothesis and the Mann–Whitney U test to identify genes with significantly different expression levels between groups. A false discovery rate (FDR) control of 10% was applied to correct for multiple testing (32). Cluster analysis (http://rana.lbl.gov/eisen/; ref. 33) was used to cluster the samples on the basis of the gene expression values and TreeView (http://rana.lbl.gov/eisen; ref. 33) was used to visualize the results. DAVID (Database for Annotation, Visualization, and Integrated Discovery, david.abcc.ncifcrf.gov; refs. 34, 35) was used to functionally annotate genes and identify the over-represented functions, with P values corrected for multiple testing via the Benjamini-Hochberg's procedure. All human genes were used to compare frequencies of functions. Unless stated otherwise, all statistical tests were 2-sided with P < 0.05 considered as statistically significant.
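For readers unfamiliar with FDR control, the Benjamini-Hochberg step-up procedure at a 10% FDR can be sketched as below. This is a generic textbook implementation and not the SPSS or GenEx routine used in the study; the function name and the example P values are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.10):
    """Return a list of booleans marking which hypotheses are rejected.

    Step-up rule: find the largest rank k with p_(k) <= (k / m) * fdr and
    reject every hypothesis whose P value has rank k or lower.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            max_rank = rank
    significant = [False] * m
    for idx in order[:max_rank]:
        significant[idx] = True
    return significant


# Hypothetical P values from per-gene Mann-Whitney U tests
p_values = [0.001, 0.04, 0.03, 0.20, 0.80]
print(benjamini_hochberg(p_values))
```

Because the rule is step-up, a P value that misses its own rank threshold can still be declared significant when a larger P value further down the ranking meets its threshold.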
Quality control measures taken to ensure reliable measurement of CTC-specific gene transcripts
The first purpose of this study was to establish a sensitive method to perform both mRNA and miRNA expression analysis of transcripts specific for CTCs, in samples often containing only a few CTCs in an environment of excess quantities of contaminating leukocytes.
To select the gene transcripts, we used the approach described in detail in the Materials and Methods section, resulting in 43 putative breast CTC–specific miRNAs, 85 putative breast CTC–specific mRNAs, and 5 control mRNAs. Our first challenge was to find a method that would enable us to measure, in a reliable and quantitative manner, both mRNAs and miRNAs in RNA isolated from as few as 5 CTCs (approximately 50 pg total RNA), which is considered the clinically relevant cut point in patients with metastatic breast cancer (24–26). In this respect, as already described and tested for the mRNA assays (18), any individual miRNA expression assay showing as a nonhomogeneously amplified outlier in our tests should be treated with caution because the data may not be truly representative for the original sample. Therefore, our assay had to have a high sensitivity combined with a minimum number of nonhomogeneously amplified miRNA and mRNA assays. To achieve this, we combined the already sensitive multiplex stem-loop cDNA approach with the TaqMan-based linear preamplification method, both from ABI. To validate the sensitivity and linear and homogeneous nature of this combined technique, we performed comparative tests between serially diluted nonamplified and multiplexed preamplified cDNA from total RNA of pooled primary breast tumors, as described before (18). The homogeneity of amplification was set at a cutoff of 2 Ct, i.e., for an assay to be considered homogeneously amplified, the number of cycles that were required after preamplification should be within a 2 Ct range of the number of cycles that were required for the nonamplified material. After adjusting for the median 15.5 Ct gain due to the preamplification procedure, data of 11 miRNA assays were outside this range (Table 2, lower).
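The homogeneity criterion can be expressed as a small check. The sketch below is illustrative only: the assay names and Ct values are hypothetical, and it assumes that "within a 2 Ct range" means an absolute deviation of at most 2 Ct from the median preamplification gain, which the text does not state explicitly.

```python
from statistics import median


def homogeneity_flags(ct_nonamplified, ct_preamplified, tolerance=2.0):
    """Flag each assay as homogeneously amplified (True) or not (False).

    The gain of an assay is the number of cycles saved by preamplification.
    Assumption: an assay is homogeneous when its gain deviates from the
    median gain across assays by no more than `tolerance` Ct.
    """
    gains = {a: ct_nonamplified[a] - ct_preamplified[a] for a in ct_nonamplified}
    median_gain = median(gains.values())
    return {a: abs(g - median_gain) <= tolerance for a, g in gains.items()}


# Hypothetical Ct values for five assays (not taken from Table 2)
ct_nonamplified = {"assay_A": 30.0, "assay_B": 28.0, "assay_C": 32.0,
                   "assay_D": 33.0, "assay_E": 29.5}
ct_preamplified = {"assay_A": 14.5, "assay_B": 13.0, "assay_C": 16.0,
                   "assay_D": 14.0, "assay_E": 14.0}

print(homogeneity_flags(ct_nonamplified, ct_preamplified))
```

In this made-up example the median gain is 15.5 Ct, and assay_D, with a gain of 19 Ct, would be flagged as nonhomogeneously amplified.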
After testing the 43 miRNAs in a multiplex cDNA PCR reaction in our patient cohort of 50 CTC samples, data of 2 additional miRNAs (hsa-miR-10b and RNU43) had to be discarded because they generated very poor amplification curves. Finally, the PCR efficiency of 2 of the remaining 30 assays was outside our set range of 75% to 125% (hsa-miR-135b, 135%, and hsa-miR-452#, 73%) and these miRs were therefore also excluded from our final analyses (Table 2, column 6).
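The text does not state how PCR efficiency was derived from the standard curve. A commonly used relation, assumed here, is E = 10^(-1/slope) - 1, where the slope comes from a linear fit of Ct against the log10 of the input amount. The sketch below uses hypothetical dilution data and hypothetical function names, and applies the 75% to 125% acceptance range given above.

```python
def standard_curve_slope(log10_inputs, cts):
    """Least-squares slope of Ct versus log10(input amount)."""
    n = len(cts)
    mean_x = sum(log10_inputs) / n
    mean_y = sum(cts) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log10_inputs, cts))
    sxx = sum((x - mean_x) ** 2 for x in log10_inputs)
    return sxy / sxx


def pcr_efficiency_percent(slope):
    """Efficiency in percent, assuming E = 10^(-1/slope) - 1 (100% = doubling per cycle)."""
    return (10 ** (-1.0 / slope) - 1.0) * 100.0


def within_accepted_range(efficiency_percent, low=75.0, high=125.0):
    """Acceptance range for PCR efficiency as stated in the text."""
    return low <= efficiency_percent <= high


# Hypothetical 10-fold dilution series (log10 of relative input) and measured Ct values
log10_inputs = [3.0, 2.0, 1.0, 0.0]
cts = [20.0, 23.32, 26.64, 29.96]

slope = standard_curve_slope(log10_inputs, cts)
efficiency = pcr_efficiency_percent(slope)
print(round(slope, 2), within_accepted_range(efficiency))
```

Under this relation, a slope near -3.32 corresponds to an efficiency close to 100%, and the two excluded assays (135% and 73%) would fail the range check.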
A summary of the results of these quality control experiments, which left us with 28 potentially breast CTC–specific miRNAs that could be measured reliably after our multiplexed cDNA followed by the preamplification procedure, is listed in Table 2.
Finally, when implementing an assay into clinical diagnostics, it is important that data can be compared in-between qRT-PCR sessions. For our mRNA measurements, we have previously shown that the data are reproducible using the preamplification procedure from ABI (18). To certify that the miRNA data generated with these assays and the multiplex preamplification procedure were also reproducible between different qRT-PCR sessions, a control RNA sample consisting of 300 pg total RNA of a pool of breast tumors was included in each session. The relative expressions (average delta Ct ± 95% CI) of the 28 miRNAs measured in this control sample in 28 independently performed multiplexed preamplified qRT-PCR sessions (Fig. 1) with a median coefficient of variation (CV) at the absolute Ct level of 6%, ranging from 3% for hsa-miR-200a# to 15% for hsa-miR-184, illustrate the robustness of our method.
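The between-session coefficient of variation can be computed as in the following sketch. The Ct values are hypothetical, the function name is illustrative, and the use of the sample standard deviation (n - 1 denominator) is an assumption because the text does not specify it.

```python
from statistics import mean, stdev


def coefficient_of_variation_percent(values):
    """CV in percent: sample standard deviation divided by the mean, times 100."""
    return stdev(values) / mean(values) * 100.0


# Hypothetical absolute Ct values of one miRNA in the control sample across sessions
session_cts = [24.0, 25.5, 23.0, 26.0, 24.5]
print(round(coefficient_of_variation_percent(session_cts), 1))
```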
mRNAs and miRNAs differentially expressed between CTC preparations and leukocytes
The miRNA analyses showed that of the 446 miRNAs investigated, 28 miRNA transcripts could be measured reliably and linearly in a multiplex preamplification reaction with an anticipated more than 10-fold (median 160-fold) higher expression in CTCs relative to blood-derived leukocytes (Table 2 and Supplementary Table S2A). Of these 28 small RNAs, only 1 miRNA (hsa-miR-183) was higher expressed in the 32 samples that contained at least 5 CTCs than the 9 samples without detectable CTCs after the CellSearch procedure. At an FDR of 10%, 9 additional miRNA transcripts were more abundantly expressed in the preparations that contained at least 5 CTCs relative to WB preparations of HBDs prior to (n = 14) or after (n = 8) CellSearch enrichment (Table 3).
For the mRNA transcripts, we used the approach described in detail before (18). Of the thus in silico selected 85 putatively CTC-specific and/or for breast cancer clinically relevant genes (Supplementary Table S2B), 55 were at an FDR of 10% significantly higher expressed in the 32 samples of patients with at least 5 CTCs than 31 CellSearch-enriched HBD samples. A gene expression call rate of 55 of 85 (65%) is within the limits of what can be expected for a profiling study (36). In addition to these 55 mRNA transcripts, another 6 mRNA transcripts were more abundantly present in the 32 samples that contained at least 5 CTCs relative to the 14 WB samples from HBDs prior to CellSearch enrichment. Of the 55 mRNA transcripts, 14 were also more abundantly expressed in the 32 samples with at least 5 CTCs relative to the 9 enriched metastatic breast cancer blood samples without detectable CTCs. Finally, only 6 genes, including the 2 leukocyte control genes PTPRC (CD45) and BST1, were found to be significantly higher expressed in the 31 CellSearch-enriched HBD samples than the 32 patient samples with at least 5 CTCs (Table 3B).
Unsupervised hierarchical clustering to identify clusters of patients according to gene expression patterns
Next, unsupervised 2-dimensional average linkage hierarchical cluster analysis (33) was done to compare the gene expression profiles of our 50 patients. For this, we used the 65 genes (55 mRNA and 10 miRNA transcripts) that were at a 10% FDR more abundantly expressed in CellSearch-enriched fractions of the 32 patients with at least 5 CTCs (Table 3 and Fig. 2).
This analysis resulted in a clustering of 4 groups of patients with a clear discrimination between patient cluster 1 and patient clusters 2 to 4. The median number of counted CTCs for cluster 1 was 1 (range: 0–173) CTC; for cluster 2, 14 (0–138) CTCs; for cluster 3, 41 (0–2,262) CTCs; and for cluster 4, 74 (0–886) CTCs (Fig. 2).
Regarding the gene clustering, 5 gene clusters with a correlation of more than 0.2 could be identified. In the largest 18-gene cluster (gene cluster 1), “signaling” was the most significant common category for 12 genes (MUCL1, FGFR4, FGFR3, ERBB4, CXCL14, PLOD2, PIP, TFF3, FKBP10, IGFBP2, TIMP3, and PLAU) as identified by DAVID (34, 35) analysis (3.9-fold enriched, P = 0.0014). In addition to these signaling genes, this gene cluster contains some potentially interesting drug targets such as ERBB4, FGFR3, and FGFR4.
The second-gene cluster (gene cluster 2, correlation 0.40) is characterized by luminal genes, such as CCND1 (37), ESR1, KRT18 (37), and MUC1, of which MUC1 has previously been used by others for the detection of CTCs in breast cancer (38–42). At an enrichment of 8.0-fold, Benjamini P = 0.008, “mutagenesis site,” i.e., genes with mutational hot spots, was the most significant category identified by DAVID for 6 genes (MUC1, CCND1, KRT18, ESR1, CEP55, and FEN1) in this 7-gene cluster.
One distinct gene cluster (gene cluster 3, correlation 0.35) was responsible for the association with the absence of CTCs, i.e., patient cluster 1. This 14-gene cluster contains the previously identified CTC-specific genes KRT19, AGR2, S100A16, and KRT7; as could be expected, TACSTD1, the gene encoding EpCAM, the antigen that was used to enrich for CTCs; and the miRNAs hsa-miR-452 and hsa-miR-34a.
Notably, the miRNA cluster (gene cluster 4, correlation 0.20) containing hsa-miR-183, hsa-miR-184, hsa-miR-379, and hsa-miR-424 shows an expression pattern that seems to be inversely related to the “mutagenesis” gene cluster 2, which includes ESR1, the gene that encodes the ER. This suggests that these miRNAs might be negatively regulated by ER or, vice versa, that these miRNAs negatively regulate ER.
Although no specific category was identified by DAVID as significantly enriched in the last cluster (gene cluster 5, correlation 0.20), this cluster seems to be dominated by genes associated with cell-cycle progression and proliferation such as DUSP4 (MKP2; ref. 43), KIF11, KPNA2, and MKI67. Interestingly, a putative stem cell marker (ITGA6; ref. 44) is also included in this last cluster.
To ascertain that the signals we generated were indeed tumor CTC specific, we also performed a clustering analysis with inclusion of the 14 full blood HBDs (FB-HBD) from which we had data from both the mRNAs and miRNAs (Supplementary Fig. S1). These HBDs (marked in green below the cluster) indeed clustered closely together. Also, the patients from patient cluster 1 (Fig. 2, and marked in red below the cluster diagram in Supplementary Fig. S1), which were characterized by the lack of expression of epithelial marker genes, remain clustered together, next to the HBD cluster.
To further validate that our identified 65-gene expression profile is able to clearly discriminate between signals derived from leukocytes that remain after CellSearch enrichment and signals derived from epithelial cells, we performed a proof-of-principle spiking experiment. For this, gene expression profiles of cells from 4 different breast cancer cell lines were compared with those of HBD samples of 5 different healthy volunteers and an HBD sample of a healthy volunteer in which the RNAs of the 4 different tumor cell lines were spiked in a final quantity equivalent to approximately 1 CTC (approximately 10 pg) per 1.5 mL blood. As can be appreciated from Supplementary Figure S2, a clear distinction can be seen between mixed and unmixed HBD and cell line samples. More importantly, no clear distinction can be seen between the final expression data by using RNA of the cell lines and the cell lines mixed with RNA from HBD.
These data point to a lack of contribution of the leukocytes to the overall gene expression results and confirm that our molecular CTC profile is indeed able to discriminate between signals from leukocytes and epithelial-specific signals from CTCs.
Associations of the CTC molecular profile with primary tumor characteristics
For the association of the molecular profile with primary tumor characteristics, we continued with the 36 patients in patient clusters 2 to 4. These patients displayed a molecular CTC profile with very distinct patterns from the 14 patients in patient cluster 1, which were characterized by the lack of expression of epithelial marker genes. Detailed clinicopathological information of our patient cohort, subdivided in 2 groups (patient cluster 1 versus clusters 2 to 4) on the basis of our molecular CTC–specific profile, is given in Table 1. There were no differences between both groups in terms of nodal status, tumor size, histological tumor type, grade, ER, PR, and HER2 status. The only significant association with clinical information was that the patients of clusters 2 to 4 displayed a 2-fold higher rate of having both visceral and nonvisceral metastases, as opposed to only visceral or nonvisceral metastasis for the patients of cluster 1.
Almost identical results were obtained when the associations of primary tumor and patient characteristics were studied on the basis of CTC count subdivided in 2 groups (patients with less than 5 CTCs versus patients with at least 5 CTCs; Supplementary Table S1).
Associations of gene transcripts measured in CTCs with current drug targets
Although we could not measure PGR transcripts reliably in the CTCs due to the relatively high PGR levels present in the contaminating leukocytes, we could measure ESR1 and ERBB2 mRNA transcript levels, the genes for ER and HER2, respectively, in the CellSearch-enriched CTCs. ESR1 and ERBB2 expression levels measured in the 36 patients from clusters 2 to 4 with expression of epithelial marker genes and compared with ER and HER2 status of the primary tumor as assessed by routine pathological immunohistochemical procedures (with additional FISH for the HER2++ cases), respectively, are shown in Figure 3.
Comparison of gene profiles measured in the CTCs and corresponding primary tumors of metastatic breast cancer patients
We could retrieve 8 primary tumor tissues (3× FFPE and 5× FF) of our cohort of patients with at least 5 CTCs at the time of metastatic disease (median: 174; range: 7–2,262 CTCs). We measured the 65 genes of our mRNA and miRNA panel in these tissues after adjusting levels measured in FFPE to those measured in FF.
From the unsupervised average linkage correlation clustering (Fig. 4), it became clear that most CTC samples clustered well with the corresponding primary tumor tissue (T) and that the clustering was not dependent on the origin of the primary tissue (FF or FFPE).
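As an illustration of what unsupervised average linkage correlation clustering does, the following is a minimal sketch in plain Python. It is not the software used in this study. The distance definition (1 minus the Pearson correlation), the function names, and the toy expression values are all assumptions made for the example.

```python
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors.

    Assumes neither vector is constant (otherwise the variance is zero).
    """
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def average_linkage(profiles):
    """Agglomerative clustering with distance = 1 - Pearson r (an assumption).

    profiles: dict mapping sample name -> list of expression values.
    Returns the merges in order as (members_a, members_b, distance).
    """
    # Pairwise correlation distances between individual samples.
    dist = {
        frozenset((a, b)): 1.0 - pearson(profiles[a], profiles[b])
        for a, b in combinations(profiles, 2)
    }

    def linkage(pair):
        # Average linkage: mean distance over all cross-cluster sample pairs.
        a, b = pair
        return mean(dist[frozenset((i, j))] for i in a for j in b)

    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=linkage)
        merges.append((a, b, linkage((a, b))))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# Hypothetical toy data: 5 genes measured in the CTC fraction and the
# primary tumor (T) of two invented patients.
toy = {
    "P1_CTC": [1.0, 2.0, 3.0, 4.0, 5.0],
    "P1_T": [1.2, 2.1, 2.9, 4.3, 5.1],
    "P2_CTC": [5.0, 4.0, 3.0, 2.0, 1.0],
    "P2_T": [4.8, 4.1, 3.2, 1.9, 1.1],
}
merges = average_linkage(toy)
for members_a, members_b, distance in merges:
    print(members_a, members_b, round(distance, 3))
```

In this toy example each CTC sample is joined to its matching tumor sample before the two patients are merged, which is the pattern described for most CTC and primary tumor pairs in Fig. 4.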
In this study, we describe a robust method to simultaneously determine the expression levels of 65 epithelial tumor cell–specific miRNAs and mRNAs in CTCs enriched by CellSearch. The rationale for using the CellSearch technique as a starting point was to develop a simple PCR-based molecular characterization that can be performed on material obtained in a clinical setting. Because the CellSearch method is currently the only FDA-approved semi-automated method to capture CTCs, taking CellSearch-enriched CTCs as a starting point for our method will enable its implementation in clinical studies and broaden its application possibilities. However, although the EpCAM-based enrichment employed by the CellSearch technique eliminates a large proportion of leukocytes (approximately 4-log depletion), considerable quantities of leukocytes are still present after this enrichment (18). This remaining leukocyte contamination, together with the low frequency of CTCs, forms a challenge when aiming to characterize CTCs by the expression of multiple genes. Despite these challenges, our data indicate that we have succeeded in measuring true epithelial tumor cell–specific genes in CTCs with our CTC-specific 65-gene panel and have avoided generating a predominantly leukocyte-derived signal. We achieved this, first, by selecting only genes highly expressed in breast cancer samples and not, or at a much lower level, in blood from HBDs. Second, we validated the true epithelial-specific expression with clustering analyses, which showed that, based on the expression of the 65 genes of our molecular profile, the HBDs and breast cancer patients without detectable CTCs clustered closely together and could be clearly separated from the breast cancer patients with detectable CTC numbers (Fig. 2 and Supplementary Fig. S4). In addition, when our 65-gene profile was used, most CTC samples clustered well with the corresponding primary tumor tissue (Fig. 4).
Finally, as a proof of principle, we showed that profiling with our 65-gene panel before and after spiking RNA of HBDs with RNA from 4 different cell lines in a final quantity equivalent to approximately 1 CTC per 1.5 mL blood clearly separated the mixed and unmixed cells (Supplementary Fig. S2). These data confirmed that our molecular 65-gene profile is indeed able to discriminate between signals from leukocytes and epithelial-specific signals from CTCs.
On the basis of the expression levels of this 65-gene profile, we could identify 4 different patient clusters characterized by 5 distinct gene clusters (Fig. 2). One distinct 14-gene cluster (gene cluster 3) was responsible for the association with the absence of CTCs. To further appreciate the strength of our 65-gene profile in relation to CTC count, it should be noted that the CTC counts were derived from 1 of the 2 aliquots of 7.5 mL blood samples that were processed with the CellSearch Epithelial Kit, whereas the other aliquot used for the molecular profiling was processed with the CellSearch Profile Kit. This inevitably introduced stochastic variation between the tumor cell content in the 2 aliquots, which is more pronounced in the lower range of CTC counts. Discussion has also started about the actual number of isolated CTCs differing between the enumeration and profile kits (45). The given cell counts could therefore only be used as a rough estimate for our molecular profile.
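The stronger effect of aliquot-to-aliquot variation at low CTC counts can be illustrated with a simple sampling model. The sketch below assumes that CTCs are distributed independently in the blood, so that the number of cells in a 7.5 mL aliquot follows a Poisson distribution. This model and the function names are assumptions made for illustration and are not part of the study.

```python
import math


def prob_zero_cells(expected):
    """Poisson probability that an aliquot contains no CTCs at all."""
    return math.exp(-expected)


def coefficient_of_variation(expected):
    """Relative spread (SD / mean) of a Poisson count with the given mean."""
    return 1.0 / math.sqrt(expected)


# Hypothetical expected CTC counts per 7.5 mL aliquot.
for expected in (1, 5, 100):
    print(
        expected,
        round(prob_zero_cells(expected), 4),
        round(coefficient_of_variation(expected), 2),
    )
```

Under this assumption the relative spread shrinks with the square root of the expected count, so two aliquots from a patient with around 100 CTCs agree far better in relative terms than two aliquots from a patient with around 5 CTCs.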
Nevertheless, with 14 of 55 mRNAs (25.4%) and only 1 (hsa-miR-183) of 28 miRNAs (3.6%) more highly expressed in the 32 samples that contained at least 5 counted CTCs compared with the 9 samples without detectable CTCs after the CellSearch enrichment procedure with the Epithelial Kit, it seems to be easier to discriminate between CTC-specific and leukocyte-derived mRNAs than between CTC-specific and leukocyte-derived miRNAs. Possibly, the detected miRNA transcripts were derived from cell fragments present in the blood of cancer patients without detectable intact CTCs. The fact that we could measure them might be associated with the remarkable stability of miRNA transcripts in blood (46). Indeed, the detection of an additional 9 of 28 (32.1%) miRNAs that were more highly expressed in breast cancer patients without detectable CTCs than in WB preparations of HBDs prior to (n = 14) or after (n = 8) CellSearch enrichment, compared with an additional 6 of 55 (10.9%) for mRNAs, further supports this thought. For these reasons, we felt confident to continue our analyses with those samples that did contain CTCs according to our molecular profile (patient clusters 2 to 4 in Figure 2), irrespective of the CTC count in the blood sample that was processed in parallel with the CellSearch Epithelial Kit.
To show the potential clinical utility of measuring these 65 marker genes in CTCs, we looked further at the data generated with our molecular profiling on the levels of 2 well-known genes in breast cancer, ER and HER2 (Fig. 3). For 1 of the patients whose primary tumor was assessed to be ER-negative, a clearly positive ESR1 signal was detected in the CTCs (CTC087 in Figure 2) obtained at the time of metastatic disease 7 years after surgical removal of the primary tumor. However, and perhaps even more disturbing, in 11 of 30 patients (37%) whose primary tumor was ER-positive, no detectable ESR1 transcript levels were measured in the CTCs obtained 1 to 149 months after primary surgery. Thus, although these patients would have an indication for antihormonal treatment according to the primary tumor characteristics, no benefit might be expected from this therapy on the basis of these CTC characteristics. However, because only 4 of these 11 patients were actually treated with antihormonal therapy, no conclusion can be drawn yet on the efficacy of hormonal treatment in these patients with ESR1-negative CTCs and ER-positive primary tumors. Of note in this respect is that half the patients with relatively high CTC-associated ESR1 levels expressed relatively low levels of TFF1 (Fig. 2). TFF1 is a gene under the control of ER. Perhaps assessment of simultaneous TFF1 expression in CTCs might be able to identify a subset of patients with ER-positive CTCs with functionally active ER, which is more likely to respond to hormonal treatment (47).
Similarly, the CTCs of at least 4 patients with HER2-negative primary tumors were ERBB2-positive at the time of metastatic disease, whereas in 2 patients with a HER2-positive primary tumor, no detectable ERBB2 mRNA could be measured in their CTCs. For those 4 patients with ERBB2-positive CTCs, anti-HER2 therapy is not indicated on the basis of primary tumor characteristics, whereas this treatment could nonetheless be beneficial based on their CTC characteristics.
No clinically relevant cut point has yet been established for ER and HER2 measured by qRT-PCR in CTCs. Nevertheless, such discrepancies between the levels of ER and HER2 measured in the primary tumor and metastases and CTCs have been described before at both the mRNA and the protein level (40, 45, 48–51), indicating that the findings with our multigene measuring technique may indeed be relevant, not only for ER and HER2 but also for the other markers included in our panel.
After clustering CTCs and primary tumors based on the expression of all 65 genes, the only obvious discrepancy we observed between the CTCs and the corresponding primary tumors of 8 different patients concerned patient 2. With 2,262 CTCs, this was the patient with the highest number of CTCs, and thus with an expected negligible effect of the presence of contaminating leukocytes in the expression analysis. The primary tumor of this patient was originally assessed as lobular, low-grade, pT2, ER-positive, PR-positive, and HER2-negative. Such a lobular tumor, with scattered epithelial cell clusters, and associated contaminating RNA from many stromal cells (52), may have contributed to this poor correlation with the expression profile of the high number of CTCs.
Although the high degree of homology in the gene expression profiles of CTCs and corresponding primary tumors was reassuring, discrepancies in expression of individual genes, such as for ESR1 in patients 5, 6, and 8 (Fig. 4), were detected. Another example in this respect is patient 8, whose CTCs expressed much higher levels of markers associated with cell-cycle progression and proliferation, such as DTL, KIF11, KPNA2, and MKI67, than the primary tumor (Fig. 4). Such differences between the primary tumor and CTCs isolated at the time of metastatic disease might prove clinically relevant and thus deserve further research.
In summary, by excluding genes with a relatively higher expression in leukocytes, our CTC-specific 65-gene set, consisting of 55 mRNAs and 10 miRNAs, is able to generate a large amount of highly relevant CTC-specific data, even in the presence of a background signal derived from leukocytes cocaptured with CTCs when using the CellSearch procedure.
Although assessed in a relatively small series, we found discrepancies in several important factors such as ER, HER2, and other genes between primary tumor tissue and CTCs. This is not surprising given the time elapsing between primary tumor resection and CTC collection, which occurred at the diagnosis of metastatic disease, and the fact that several patients received prior adjuvant systemic therapy. The discrepancies in molecular characteristics between primary tumor tissue and CTCs clearly stress the importance of further studies on molecular characterization of CTCs.
Disclosure of Potential Conflicts of Interest
The salaries and bench fee for this collaborative work were covered in part by Veridex. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
This study was in part financially supported by the Netherlands Genomic Initiative (NGI)/Netherlands Organization for Scientific Research (NWO).
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
We especially thank the patients for their willingness to participate and the surgeons, pathologists, and medical oncologists for their assistance in collecting samples and patients' clinical follow-up data.
Note: Supplementary data for this article are available at Clinical Cancer Research Online (http://clincancerres.aacrjournals.org/).
- Received February 1, 2011.
- Revision received March 13, 2011.
- Accepted April 11, 2011.
- ©2011 American Association for Cancer Research.