Background: Cancers of the urinary bladder are the fifth most commonly diagnosed malignancy in the United States. Early clinical diagnosis of bladder cancer remains a major challenge, and the development of noninvasive methods for detection and surveillance is desirable for both patients and health care providers.
Approach: To identify urinary proteins with potential clinical utility, we enriched and profiled the glycoprotein component of urine samples by using a dual-lectin affinity chromatography and liquid chromatography/tandem mass spectrometry platform.
Results: From a primary sample set obtained from 54 cancer patients and 46 controls, a total of 265 distinct glycoproteins were identified with high confidence, and changes in glycoprotein abundance between groups were quantified by a label-free spectral counting method. Validation of candidate biomarker alpha-1-antitrypsin (A1AT) for disease association was done on an independent set of 70 samples (35 cancer cases) by using an ELISA. Increased levels of urinary A1AT glycoprotein were indicative of the presence of bladder cancer (P < 0.0001) and augmented voided urine cytology results. A1AT detection classified bladder cancer patients with a sensitivity of 74% and specificity of 80%.
Summary: The described strategy can enable higher resolution profiling of the proteome in biological fluids by reducing complexity. Application of glycoprotein enrichment provided novel candidates for further investigation as biomarkers for the noninvasive detection of bladder cancer. Clin Cancer Res; 17(10); 3349–59. ©2011 AACR.
The long-term survival of patients with bladder cancer is influenced significantly by the detection of early-stage disease. Accurate biomarkers that can be applied to noninvasively obtained urine samples would be ideal not only for detection but also for surveillance and asymptomatic screening. In this study, we established a comprehensive proteomics strategy for discovery of urinary glycoproteins associated with bladder cancer by using dual-lectin affinity enrichment and liquid chromatography/tandem mass spectrometry analysis. Comparative profile analysis and validation with an orthogonal technique confirmed that differential urinary glycoprotein profiles exist that can distinguish bladder cancer–bearing patients from individuals with no evidence of bladder cancer.
With an estimated 70,980 newly diagnosed cases and 14,330 associated deaths in 2009, urinary bladder cancer is the second most common genitourinary malignant disease in the United States and among the 5 most common malignancies worldwide (1). Transitional cell carcinoma, the most prevalent subtype, accounts for 90% of bladder cancers, whereas 2 other forms, squamous cell carcinomas and adenocarcinomas, constitute 5% and 2%, respectively (2). When the disease is detected early, the 5-year survival rate is approximately 94%; thus, timely intervention dramatically improves patient outcome. At presentation, more than 80% of bladder tumors are nonmuscle invasive papillary tumors (Ta or T1), and these superficial lesions are treated conservatively by transurethral resection. However, more than 70% of patients with these lesions will have disease recurrence during the first 2 years. If left untreated, these initially superficial lesions can progress to muscle-invasive disease; patients are therefore kept under continued surveillance by expensive and invasive cystoscopic examination for early detection of new tumor developments.
Voided urine cytology (VUC) remains the method of choice for the noninvasive detection of bladder cancer. This method microscopically examines the morphology of cells shed from the bladder lining and collected from the urine. However, the diagnostic accuracy of VUC is limited (3, 4). The method is subjective and open to considerable interobserver variation, so accuracy is a problem, especially for low-grade and low-stage tumors. Furthermore, results are not available rapidly, and the test is relatively expensive. Although VUC has specificities up to 98% (5), its utility as a primary detection tool is reduced because of sensitivities ranging from 11% to 76% (6). In a recent paper, we found the sensitivity of urinary cytology to be 33% (7). The major problem with urine cytology is that results are operator dependent: the skill of the cytologist can produce variable results across institutions (8). This variation in accuracy is reflected in the fact that VUC is rarely the only test done during a clinical workup to investigate potential bladder cancers (9).
A number of molecular tests have been developed to detect putative tumor-associated molecules that are released into the urine from malignant lesions. Tests include the bladder tumor antigen (10), nuclear matrix protein 22 (11, 12), BLCA-4, cytokeratin (13), psoriasin (S100A7; ref. 14), fibronectin/degradation products (15, 16), and zinc-alpha2-glycoprotein (17). However, these markers have limited sensitivity, so alone they are not yet sufficient to replace the gold standards of urine cytology or cystoscopy (18, 19). The search for more sensitive, noninvasive urine tests for more reliable detection and surveillance is warranted.
Proteomic profiling approaches provide an unbiased and systematic survey of the proteome, and technologies continue to be developed to address problems of biological variation and proteome complexity. In our own work, we have focused on analyzing the glycoprotein component of the proteome. As much as half of the mammalian proteome is composed of glycoproteins that are involved in essential cellular and biological functions, and many of the biomarkers currently used for cancer detection in body fluids are glycoproteins, including PSA (20), and CA-125 (21). Analytical tools for glycoprotein separation have been developed based on chemical reaction, gel-binding, and lectin affinity chromatography. The latter has become a powerful tool for the purification and concentration of glycoproteins. Lectins constitute a group of proteins with unique affinities for carbohydrate structures. Widely used lectins include concanavalin A (ConA) and wheat germ agglutinin (WGA). ConA recognizes alpha-linked mannose, which is very common in N-linked glycans, and WGA selectively binds to N-acetyl-glucosamine (GlcNAc) groups and sialic acid residues.
The aim of this study was to design and apply a comprehensive strategy for the identification of glycoprotein biomarkers in urine for potential utility in bladder cancer detection. We employed dual-lectin affinity (ConA and WGA) chromatography to isolate a wide range of N-linked glycoproteins from urine samples and incorporated a LC/MS-MS–based label-free shotgun method to identify glycoproteins that were differentially expressed according to disease status. One of the most discriminatory novel candidate biomarkers, alpha-1-antitrypsin (A1AT), was validated in an independent sample cohort by using ELISA. The strategy identified specific glycoprotein biomarkers with potential utility in noninvasive bladder cancer detection.
Materials and Methods
Clinical sampling and processing
Under Institutional Review Board approval and informed consent, urothelial samples and associated clinical information were prospectively collected from individuals with no previous history of urothelial carcinoma. Patients were undergoing complete hematuria workup, including office cystoscopy and upper tract imaging, by computed tomography of the abdomen and pelvis (without and with intravenous contrast). Urine cytology was done by standard Papanicolaou staining. All slides were evaluated by a cytopathologist. Urothelial carcinoma grading and staging were done according to the World Health Organization criteria. Two different clinical cohorts were analyzed in this study. The first group (experimental) consisted of 46 subjects with either a negative evaluation (i.e., normal imaging of upper tract and normal cystoscopy) or asymptomatic donors, and 54 subjects with a visible bladder tumor detected by upper tract imaging and/or cystoscopy, and which was later proven by evaluation of a biopsy to be urothelial carcinoma. The second group (validation) comprised 35 subjects with a negative evaluation and 35 cases of confirmed bladder cancer. A summary of clinical data is given in Table 1. Each sample consisted of 30 to 50 mL of midstream urine collected in a sterile cup, stored immediately at 4°C, and processed for storage within 1 hour of collection. Specimens with clearly visible hematuria were excluded from analysis. Urinary creatinine, blood, leukocytes, nitrite, glucose, pH, ketone, and bilirubin were measured for all samples before processing by using MULTISTIX PRO Reagent Strips (Bayer HealthCare). Cells and debris were removed from each urine sample by centrifugation at 500 × g and 5,000 × g for 5 minutes and 10 minutes at 4°C, respectively, and supernatant was stored at −80°C until processing for proteomic analysis. For chromatography, supernatant was subjected to total protein precipitation by adding 4 times the sample volumes of cold acetone (−20°C). 
We found by testing replicate samples that cold acetone gave a yield of approximately 90% protein recovery. The sample was left at −20°C overnight followed by centrifugation at 12,000 × g for 15 minutes at 4°C. The supernatant was removed, and the centrifuge tube was left open in the fume hood to remove the remaining solvent. The pellet was resuspended in binding buffer (20 mmol/L Tris, 0.15 mol/L NaCl, 1 mmol/L MnCl2, and 1 mmol/L CaCl2, pH 7.4) and vortexed vigorously to completely dissolve the pellet. The protein concentration in the sample was determined by the Bradford protein assay (Bio-Rad). Throughout the study, all samples were processed and analyzed individually; no pooling of samples was done.
ConA and WGA lectin affinity chromatography
A 700 μL resin slurry of agarose-bound ConA and an 800 μL resin slurry of WGA (Vector Laboratories) were packed into a 2-mL disposable centrifuge column (Thermo Scientific). The resin was first washed 3 times with binding buffer. A 450 μg portion of diluted protein sample (1:4) was then added to the column and incubated for 30 minutes. The flow-through was discarded. Nonspecifically bound material was removed by washing the resin 4 times with 1 mL of binding buffer each time. One milliliter of elution buffer (20 mmol/L Tris, 0.5 mol/L NaCl, 1 mmol/L MnCl2, 1 mmol/L CaCl2, 0.3 mol/L methyl-α-d-mannopyranoside, and 0.3 mol/L N-acetyl-d-glucosamine, pH 7.0) was added and mixed for 10 minutes. The eluate was collected and the elution step was repeated once.
Tryptic digestion/PNGase F treatment
The protein sample eluates were concentrated by using a Microcon YM-10 column (Millipore Corp.) according to the manufacturer's protocol. Approximately 10 μL of sample was obtained from the filter device. A 1 μL portion of 100 mmol/L DTT (Sigma) was added and the resulting mixture was incubated at 45°C for 30 minutes. A 1 μL portion of trypsin (Sigma) was then added, and the mixture was incubated overnight at 37°C. The digestion and reduction reaction was terminated by adding 0.1 μL of TFA to the digest. The tryptic digestion mixture was completely dried by using a SpeedVac concentrator (Labconco Corp.) operated at 45°C. The digest was reconstituted with 20 μL of 50 mmol/L ammonium bicarbonate, and 5 units of PNGase F (Sigma) were added. The deglycosylation reaction was incubated for 24 hours in a water bath set at 37°C. The reaction was stopped by incubating the digest mixture at 75°C for 20 minutes. The sample was then dried in the SpeedVac concentrator (22).
Nano-reverse phase liquid chromatography-electrospray mass spectrometry
A Paradigm MG4 micropump (Michrom Bioresources Inc.) was used to deliver the mobile phase. The pump flow rate was split to achieve a column flow rate of 300 nL/min. The separation column (0.1 × 150 mm, C18 AQ particles, 5 μm, 120 Å) was from Michrom Bioresources. The resolved peptides were analyzed on a linear ion trap mass spectrometer with a nano-LC-ESI source (LTQ; Thermo Finnigan). A mobile phase system of 2 solvents was used: solvent A was 0.1% formic acid and 2% acetonitrile in high-performance liquid chromatography (HPLC)-grade water, and solvent B was 0.1% formic acid and 2% water in HPLC-grade acetonitrile. A 60-minute acetonitrile/water gradient method was used, starting with 5% acetonitrile, which was ramped to 40% in 55 minutes and to 95% in another 5 minutes. The LTQ instrument was operated in positive ion mode. The capillary transfer tube was set at 200°C, the ESI spray voltage at 2.5 kV, and the capillary voltage at 30 V. Ion activation was achieved by utilizing helium at a normalized collision energy of 35%. Data acquisition and generation of peak list files were done automatically by Xcalibur software. For each cycle of 1 full mass scan (range of m/z 400–2,000), the 3 most intense ions in the spectrum were selected for tandem MS analysis, unless they appeared in the dynamic or mass exclusion lists.
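The gradient program above can be written as a small piecewise function, which is convenient when relating retention times to solvent composition. This is a minimal sketch that assumes both ramps are linear and reports the nominal acetonitrile percentage stated in the text; the function name is illustrative and not part of any instrument software.

```python
def acetonitrile_percent(t_min: float) -> float:
    """Nominal % acetonitrile at time t_min (minutes) of the 60-minute method.

    Assumes linear ramps: 5% -> 40% over 0-55 min, then 40% -> 95% over 55-60 min.
    """
    if not 0 <= t_min <= 60:
        raise ValueError("time must lie within the 60-minute method")
    if t_min <= 55:
        # First segment: shallow ramp used for peptide separation
        return 5.0 + (40.0 - 5.0) * t_min / 55.0
    # Second segment: steep ramp that flushes the column
    return 40.0 + (95.0 - 40.0) * (t_min - 55.0) / 5.0
```

For example, under these assumptions the midpoint of the first ramp (27.5 minutes) corresponds to 22.5% acetonitrile.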
Database queries and manual validation
All MS/MS spectra were analyzed by using the SEQUEST algorithm, version 27 incorporated in Bioworks software, version 3.1 SR1 (Thermo Finnigan). Peptide fragment lists were generated and submitted to the Swiss-Prot database. The search parameters were as follows: (i) protein database, Uniprot/Swiss-Prot Release version 54.6 of 17-Dec-2007, downloaded from NCBI; (ii) allowing 2 missed cleavages; (iii) variable modification, oxidation of M; N+1, the +1 Da was allowed owing to hydrolysis of the amide of the asparagine side chain to release the asparagine-linked oligosaccharides from glycopeptides; (iv) peptide ion mass tolerance 1.50 Da; (v) fragment ion mass tolerance 0.0 Da; (vi) peptide charges +1, +2, and +3. To further validate data obtained from SEQUEST, Trans-Proteomic Pipeline (TPP) software was used in which the threshold score was set at 0.9 for ProteinProphet probability (overall false positive rate below 1%; refs. 23, 24). Only the proteins that passed the threshold and contained NXS/T motif in their peptide sequences were sorted as glycoproteins.
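The final filtering step (probability threshold plus presence of an N-glycosylation motif) can be illustrated with a short Python sketch. The function names are hypothetical, and excluding proline at the X position is the conventional sequon definition rather than something stated in the text, which refers only to an "NXS/T motif". The sketch is not the procedure used in TPP itself.

```python
import re

# N-X-S/T sequon. The lookahead allows overlapping sequons to be reported.
# Assumption: X may be any residue except proline (conventional definition).
SEQUON = re.compile(r"(?=N[^P][ST])")

# PNGase F converts the glycosylated Asn to Asp (about +0.984 Da),
# which is why the search allowed a +1 Da modification on N.


def sequon_positions(peptide: str) -> list[int]:
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(peptide.upper())]


def is_candidate_glycoprotein(probability: float, peptides: list[str],
                              min_probability: float = 0.9) -> bool:
    """True if the protein passes the probability threshold and at least
    one of its identified peptides carries a sequon."""
    if probability < min_probability:
        return False
    return any(sequon_positions(p) for p in peptides)
```

A protein identified with probability 0.95 from the peptide `ANGTK` would be retained, whereas the same protein identified only from `NPSK` would not, because proline occupies the X position.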
Data normalization and spectral counting quantification
Spectral counts were parsed from TPP xml files after processing the SEQUEST data. Global normalization with protein length was used to reduce technical bias when acquiring spectral count data from different runs between and across samples. It is important to take into consideration the length of a protein when determining protein abundances by using spectral counting, because small proteins tend to have fewer peptides identified per protein. If Cij is the raw spectral count for protein i in subject j, and Li is the length of protein i, then the first step in data processing is to calculate Xij by using a formula that adjusts for the total spectral count observed per sample and for the effects of protein length.
To eliminate the discontinuity in the count ratios when the spectral count for a protein is zero, the Xij data were transformed following an approach similar to that used in Old and colleagues (25), as originally proposed by Beissbarth and colleagues (26) for serial analysis of gene expression (SAGE). The transformation uses a logarithmic quantity in which Tj is the total of the Xij for subject j, and f is a positive constant. The value of f was set as the maximizer of the average Pearson correlation coefficient between pairs of replicate measures for the same subject, which was determined to be 0.7 (Supplementary Fig. S6). Finally, the data were filtered based on a minimum SD threshold of 0.01, to exclude proteins with very low variability between cancer and normal groups.
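The two processing steps can be sketched in Python as follows. The exact formulas are not reproduced in the text, so both forms below are assumptions chosen to match the verbal description: the normalization rescales each count by protein length and by the sample's total count while keeping values on the count scale, and the transformation is a log ratio of the style used for spectral count and SAGE data, log2((X + f) / (T − X + f)). Function names are illustrative.

```python
import math


def normalize_counts(counts, lengths):
    """Adjust raw spectral counts for protein length and per-sample total.

    counts[j][i] is the raw count C_ij for sample j and protein i;
    lengths[i] is the protein length L_i.
    Assumed form: X_ij = C_ij * (mean length / L_i) * (mean total / total_j).
    """
    mean_len = sum(lengths) / len(lengths)
    totals = [sum(row) for row in counts]
    mean_total = sum(totals) / len(totals)
    normalized = []
    for row, total in zip(counts, totals):
        scale = mean_total / total if total else 0.0
        normalized.append(
            [c * (mean_len / length) * scale for c, length in zip(row, lengths)]
        )
    return normalized


def log_transform(x_row, f=0.7):
    """Transform one sample's normalized counts to a log scale.

    Assumed form: Y_i = log2((X_i + f) / (T - X_i + f)), where T is the
    sum of X_i for the sample. Because f > 0, zero counts stay finite.
    """
    total = sum(x_row)
    return [math.log2((x + f) / (total - x + f)) for x in x_row]
```

With `f` set to the 0.7 reported in the text, a protein with a zero count maps to a finite negative value rather than to minus infinity, which is the discontinuity the transformation is meant to remove.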
For Western blot analyses, urine samples from 3 individual patients from each independent experimental group (noncancer and bladder cancer) were applied. An amount of 20 μg of protein was resolved on 4% to 20% Tris-SDS gels and transferred to polyvinylidene difluoride membranes. Membranes were blocked with 1% bovine serum albumin in PBST for 2 hours at room temperature and incubated with primary antibody (mouse alpha-1-antitrypsin, mouse uromodulin from Abcam, and rabbit cell adhesion molecule 1 from Abnova) overnight at 4°C. All primary antibodies were diluted 1:1,000. After three 10-minute washes with PBST, membranes were incubated with anti-mouse/anti-rabbit IgG-horseradish peroxidase secondary antibody (Invitrogen) at a dilution of 1:5,000 for 1 hour, washed, and visualized with the SuperSignal West Pico Chemiluminescent Substrate System (Pierce). The average time for image development was 15 seconds. Chemiluminescence was scanned on a LAS-3000 instrument using LAS-3000 Lite software, and scanned images were visualized and quantified by using MultiGauge v.3.1 software (FujiFilm Life Science, Inc.).
The level of human A1AT was monitored in urine samples by using the AssayMax Human A1AT ELISA kit (Assaypro). The A1AT present in standards and samples is measured by competition with a biotinylated A1AT and detected by using a streptavidin–peroxidase conjugate. Twenty-five microliters of A1AT standard or urine sample was added to each well, followed by addition of 25 μL of biotinylated A1AT. The assay was conducted according to the manufacturer's instructions. The absorbance values were read on a microplate reader (BioTek, Synergy HT, VT) at a wavelength of 450 nm.
Statistics and data analysis
For the differential expression analysis of profile data, we used linear mixed effects models to identify proteins that were differentially expressed between cancer and normal samples while reducing the effects of experimental batches. The mixed effects models used transformed glycoprotein abundance as the outcome variable and included random intercepts for the batches and a fixed effect for disease status. The Z-score for the disease status variable was used to identify differentially expressed proteins, based on the false discovery rate (FDR) values and the Bonferroni-adjusted P values. For ELISA data analysis, a standard curve prepared by using purified A1AT was generated by regression analysis with a 4-parameter logistic curve fit. Signal intensities of the A1AT disease marker were converted to concentration values by reference to the standard curve. To normalize for unknown total urine volume, these values were reported as a ratio relative to urinary creatinine values (27–29). ELISA data were plotted as a scatter plot and a receiver operating characteristic (ROC) curve. The Student's t test for paired samples was used to conduct pairwise comparison of area under the curve (AUC) for A1AT signal intensities calibrated by using urinary creatinine. A ROC curve is obtained by varying a decision threshold and provides a direct view of how a predictive approach performs at different sensitivity and specificity levels (30, 31). A diagonal line on the ROC plot gives an AUC value of 0.5, indicating that prediction between positive and negative cases is random. The other end of the scale is represented by an AUC value of 1.0, indicating perfect separation. The AUC therefore represents an overall measure of the discriminatory potential of a biomarker across all thresholds. The ROC curve also allows the relationship between sensitivity and specificity to be estimated at any given point. 
Here, sensitivity (or true-positive rate) is the probability that a subject with bladder cancer is correctly classified as having cancer, and specificity (or true-negative rate) is the probability that a subject without bladder cancer is correctly classified as not having cancer (specificity can be defined as 1 minus the false-positive rate). Differences in all analyses were regarded as significant at P < 0.05. Statistical analyses were done by SPSS 13.0 and by MedCalc version 8.0 (MedCalc Software) for ROC curve analysis.
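The ELISA calculations described above can be illustrated with a minimal Python sketch. It is not the SPSS or MedCalc procedure. The curve parameters are assumed to come from a standard-curve fit that is not shown, all names are hypothetical, and a sample is called positive when its creatinine-normalized value is at or above the threshold, consistent with elevated A1AT indicating cancer. The AUC is computed through its equivalence with the probability that a randomly chosen cancer value exceeds a randomly chosen control value.

```python
def four_pl(conc, zero_dose, inf_dose, midpoint, slope):
    """Signal predicted by a 4-parameter logistic curve at a concentration."""
    return inf_dose + (zero_dose - inf_dose) / (1.0 + (conc / midpoint) ** slope)


def inverse_four_pl(signal, zero_dose, inf_dose, midpoint, slope):
    """Concentration at which the fitted curve predicts `signal`."""
    low, high = sorted((zero_dose, inf_dose))
    if not low < signal < high:
        raise ValueError("signal lies outside the range of the standard curve")
    ratio = (zero_dose - inf_dose) / (signal - inf_dose) - 1.0
    return midpoint * ratio ** (1.0 / slope)


def creatinine_ratio(a1at_conc, creatinine_conc):
    """A1AT concentration expressed relative to urinary creatinine."""
    return a1at_conc / creatinine_conc


def sensitivity_specificity(cancer_values, control_values, threshold):
    """Sensitivity and specificity when values >= threshold are called positive."""
    true_pos = sum(v >= threshold for v in cancer_values)
    true_neg = sum(v < threshold for v in control_values)
    return true_pos / len(cancer_values), true_neg / len(control_values)


def roc_auc(cancer_values, control_values):
    """AUC as the chance that a cancer value exceeds a control value.

    Ties contribute 0.5, so identical distributions give 0.5 and
    perfect separation gives 1.0.
    """
    score = 0.0
    for c in cancer_values:
        for n in control_values:
            if c > n:
                score += 1.0
            elif c == n:
                score += 0.5
    return score / (len(cancer_values) * len(control_values))
```

In a competitive assay such as the one used here, the signal falls as analyte concentration rises, so `zero_dose` (signal with no analyte) would be larger than `inf_dose`; the inverse function handles either orientation.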
Results
Identification of urinary glycoproteins by liquid chromatography/tandem mass spectrometry
Glycoproteins from 54 bladder cancer and 46 noncancer specimens were enriched by dual lectins, ConA and WGA, and digested with trypsin (see workflow diagram Fig. 1). After N-glycans were removed from tryptic peptide mixtures by using PNGase F, the peptides were analyzed by nanoHPLC-ESI-MS/MS by using LTQ mass spectrometry and TPP software to identify glycosylated proteins. Supplementary Figure S1 shows a typical MS/MS spectrum of an N-glycosylated peptide from a glycoprotein, A1AT, identified in one cancer urine sample. After nonspecific binding proteins and any processing artifacts were removed (such as trypsin and keratin), a total of 421 urinary proteins were identified with high confidence (ProteinProphet probability scores ≥ 0.9 and a false-positive rate below 1%; ref. 32). After manual validation by using the Swiss-Prot database, we identified a total of 265 distinct glycoproteins. To further exclude proteins with low variability between cancer and noncancer groups, we compiled a list of 100 unique glycoproteins with a standard deviation (SD) of more than 0.01. A heat map was prepared to visualize the abundance of these glycoproteins across all specimens in the 2 groups (see Supplementary Fig. S2).
Assessment of reproducibility and correlation of liquid chromatography/tandem mass spectrometry data
To assess the technical variation resulting from the nano liquid chromatography/tandem mass spectrometry (nanoLC/MS-MS) instrument, the transformed spectral counting data from technical replicates were compared. This comparison revealed that the same proteins from duplicate runs exhibited nearly identical retention times, confirming a high level of reproducibility for the LTQ instrument. An example comparison is depicted in Supplementary Figure S3. Scatter plots were constructed to display the LTQ reproducibility of 2 consecutive analyses across all of the urinary samples from the noncancer and the cancer groups (Fig. 2). The linear regression coefficient r² was 0.91 and 0.82 for control and cancer samples, respectively, indicating that reproducibility was high across the entire sample set. To further evaluate MS data reliability and variance among different biological samples and batches, we investigated the Pearson correlation between transformed data for all proteins, within pairs of samples. A batch analysis scatter plot (Supplementary Fig. S4) shows that the technical replicates were highly correlated with each other, confirming our LTQ reproducibility findings. When compared within a batch, the N/N pairs (noncancer) and the T/T pairs (tumor bearing) are slightly better correlated than the N/T pairs, reflecting a reasonable degree of biological variation between noncancer and cancer specimen groups. The 3 pair-type comparisons are similar whether compared across different batches or within batches. These findings show that no major LTQ analysis batch effect was evident. However, we did observe some batch variance when comparing protein extraction data (Supplementary Fig. S5). Therefore, our subsequent differential analyses were based on within-batch comparisons and the significance level was evaluated by using a statistical model incorporating a batch factor.
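The replicate comparisons above rest on the Pearson correlation, and for a simple linear regression the r² value is the square of that coefficient. A minimal Python sketch of the calculation is given below; the function name and the example values are illustrative and are not taken from the study data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of transformed counts."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical transformed values for the same 4 proteins in 2 replicate runs
run_1 = [1.0, 2.0, 3.0, 4.0]
run_2 = [1.0, 3.0, 2.0, 4.0]
r = pearson_r(run_1, run_2)
r_squared = r ** 2  # comparable to the r² reported for replicate scatter plots
```

The same function can be applied to N/N, T/T, and N/T sample pairs to reproduce the style of pairwise comparison described in the text.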
Identification of differentially expressed glycoproteins
To find urinary glycoproteins that exhibit differential abundance between the noncancer and cancer groups, MS data were analyzed by a label-free quantification method based on spectral counts. After normalization for protein sequence length and application of the SD filter, the quantity of each protein was transformed into a log scale. The correction factor f was determined for technical replicates to achieve the maximal correlation value (Supplementary Fig. S6). After transformation, statistical significance levels between the 2 groups were analyzed with adjustment for multiple comparisons by using a stringent local FDR (locFDR) and considering the batch factor. To qualify as a potential biomarker, a glycoprotein had to be present in at least half of the cancer or noncancer samples. As a result, a list of glycoproteins was found to be significantly differentially expressed with respect to the presence of bladder cancer, with an FDR value less than 0.1 (Table 2). Three of the glycoproteins were expressed at significantly higher levels in urine from bladder cancer patients. A heat map was constructed to display the transformed spectral counts of these differentially expressed glycoproteins for all samples (Fig. 3). Consistent with the results of spectral counting, analysis of 6 randomly selected urine samples (3 noncancer and 3 bladder cancer cases) by Western blotting showed that the levels of A1AT were elevated in the bladder cancer cases, whereas UROM and CADM1 were found to decrease (data not shown).
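The presence criterion for candidate biomarkers can be made explicit with a short Python sketch. Two readings are assumed here and are not confirmed by the text: a protein counts as "present" in a sample when its raw spectral count is greater than zero, and the criterion is met when either group reaches the one-half fraction. The function name is hypothetical.

```python
def passes_presence_filter(cancer_counts, control_counts, min_fraction=0.5):
    """True if the protein is detected in at least `min_fraction` of the
    cancer samples or of the noncancer samples.

    Assumption: detection means a raw spectral count greater than zero.
    """
    def detected_fraction(counts):
        return sum(c > 0 for c in counts) / len(counts)

    return (detected_fraction(cancer_counts) >= min_fraction
            or detected_fraction(control_counts) >= min_fraction)
```

Under this reading, a protein seen in 3 of 4 cancer samples and in no controls is retained, which keeps markers that are essentially absent from one group.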
Validation of urinary A1AT abundance by ELISA
To verify the observed differential levels of the A1AT glycoprotein in bladder cancer patient samples, we measured the abundance of this target in urine samples obtained from an independent cohort (35 bladder cancer cases and 35 noncancer controls) by using a commercially available ELISA. Analyses revealed a significant association of elevated A1AT with urine samples obtained from patients with bladder cancer (Fig. 4). The difference in levels between groups was statistically significant (P < 0.0001), and construction of an ROC curve illustrates the ability of A1AT to distinguish cancer and noncancer cases (Fig. 4). The area under the ROC curve (AUROC) was 0.82. From the ROC plot, we can estimate the relationship between sensitivity and specificity of the assay at any given point. At a sensitivity of 74%, A1AT detection classified bladder cancer patients with a specificity of 80%. These values compare favorably with the gold standard of urine cytology, which has high specificity but a range of sensitivity from 11% to 76% (3, 4). In the validation cohort, 11 of the 35 cancer cases (31%) were positive by cytology. If we set arbitrary thresholds for the pilot A1AT ELISA data, we can estimate whether the ELISA data would have augmented the cytology data in some cases. Using a threshold of 3.2 from the data illustrated in Figure 4, the ELISA detected 10 cancer cases and no false positives. Of these 10 confirmed cancer cases, 7 had a negative cytology.
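The extent to which the ELISA could augment cytology in this cohort follows directly from the counts reported above. The short Python calculation below is a worked illustration only: it uses the 11 cytology-positive cases and the 7 cytology-negative cases detected by ELISA at the arbitrary threshold of 3.2, and because that threshold was chosen on the same data, the combined figure should be read as an optimistic estimate.

```python
def sensitivity(n_detected: int, n_cases: int) -> float:
    """Fraction of confirmed cancer cases that were detected."""
    if not 0 <= n_detected <= n_cases:
        raise ValueError("detected cases must be between 0 and the number of cases")
    return n_detected / n_cases


N_CASES = 35            # confirmed cancer cases in the validation cohort
CYTOLOGY_POSITIVE = 11  # cases positive by cytology
ELISA_ONLY = 7          # cytology-negative cases detected by ELISA at threshold 3.2

cytology_alone = sensitivity(CYTOLOGY_POSITIVE, N_CASES)
combined = sensitivity(CYTOLOGY_POSITIVE + ELISA_ONLY, N_CASES)

print(f"cytology alone: {cytology_alone:.0%}, cytology plus ELISA: {combined:.0%}")
# prints "cytology alone: 31%, cytology plus ELISA: 51%"
```

The two groups can be added because the 7 ELISA-detected cases were cytology negative and therefore do not overlap with the 11 cytology-positive cases.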
Discussion
Two-dimensional electrophoresis (2-DE) of proteins has been the conventional method for biomarker assessment in urological proteomics (33, 34). Irmak and colleagues identified 2 proteins, orosomucoid and zinc-alpha glycoprotein, which were increased in urine samples of tumor-bearing patients in comparison with samples from a few healthy volunteers (17), and Pinero and colleagues used 2D-DIGE coupled with mass spectrometry to identify regenerating protein-1 and keratin 10 as being associated with bladder cancer (35). Saito and colleagues (36) used a 2-D PAGE approach but focused specifically on proteins of the extracellular matrix and matrix metalloproteinases (MMP) in urine. Samples enriched with gelatin-affinity beads revealed that MMP-2, MMP-9, and fibronectin fragments were present in cancer patients but not in healthy individuals (36). To date, a limited number of proteomic studies have utilized MS technologies for the analysis of urological cancers, largely because of the lack of defined methodologies that can reduce the complexity of the sample and rapidly and accurately identify specific proteins. A number of studies have used SELDI-TOF (time-of-flight) to identify peptide signatures associated with bladder cancer, but validation of specific targets has not been forthcoming. Theodorescu and colleagues employed online coupling of capillary electrophoresis to an electrospray TOF mass spectrometer (ESI-TOF-MS) to identify a 22-peptide signature that could discriminate between samples of bladder cancer and healthy subjects (37). The signature was able to correctly classify 31 bladder cancer patients from a cohort that included 149 samples from healthy subjects or subjects with nonmalignant urological diseases. A recent study used an iTRAQ (isobaric tag for relative and absolute quantitation) technique to discover proteins that were differentially expressed between pooled urine samples and nontumor controls. This strategy identified 55 candidate biomarker proteins. 
Orthogonal techniques confirmed that the level of apolipoprotein A-I (APOA1) was significantly elevated in urine samples from bladder cancer patients. Using a commercial ELISA, APOA1 was confirmed to have high diagnostic potential in an extended sample set (38). Collectively, these studies show the promise of MS-based urinary analysis for the discovery of biomarker panels.
The majority of reported proteomic urine profile studies have used large amounts of sample material. Hundreds of milliliters of urine are typically used for gel-based analysis, and published gel analyses of the normal urine proteome typically pool samples from different individuals to increase the amount of protein (39). Thus, a reliable and accurate profiling technique requiring only a small amount of urine is essential for investigating urinary proteomes and for marker identification in minimal samples. To improve efficiency and to make the glycoprotein enrichment strategy applicable to minimal sample material, we developed a nano-scale chelating ConA monolithic capillary prepared by using GMA-EDMA (glycidyl methacrylate–co-ethylene dimethacrylate) as polymeric support (40). We used this technique to identify 186 distinct N-linked glycoproteins in as little as 10 mL of naturally micturated human urine. Utility for analysis of minimal biological samples was confirmed by the successful elucidation of glycoprotein profiles in mouse urine samples at the microliter scale (40). As advances in proteomics technologies continue, and findings are collated in established and curated publicly available databases, the future for the discovery and validation of urinary biomarkers of multiple diseases is encouraging.
In this study, we specifically investigated the glycoprotein component of the naturally micturated urinary proteome and compared the profiles obtained from analysis of urine samples from 54 patients with bladder cancer and from 46 noncancer controls. We utilized a dual-lectin affinity strategy to enrich N-linked glycoproteins and incorporated a LC/MS-MS–based label-free shotgun method to identify and compare glycoproteins present in the clinical sample panel. A total of 265 distinct glycoproteins were identified with high confidence, and comparative analysis confirmed that differential urinary glycoprotein profiles exist that can distinguish bladder cancer–bearing patients from individuals with no evidence of bladder cancer.
Quantitative shotgun proteomics, including both stable isotope labeled and label-free techniques, has emerged as a major platform for investigating large-scale protein expression and characterization in complex biological systems (41, 42). Stable isotope labeling approaches, such as isotope-coded affinity tags, iTRAQ, and other metabolic and enzymatic labeling (43, 44), can provide accurate quantitative results. However, these methods are not ideal for the initial large-scale discovery phase because of high cost, incomplete labeling, and artificial introduction of chemical derivatization. In recent years, an alternative shotgun proteomics strategy based on spectral counting has been developed. With this method, relative protein quantification is achieved by comparing the number of identified MS/MS spectra for a given protein species across samples (45). Previous studies have shown that spectral counting has a strong linear correlation with relative protein abundance (r² = 0.9997), with a dynamic range of more than 2 orders of magnitude, when compared with sequence coverage and number of peptides (46). It has also been shown to have higher reproducibility and a larger dynamic range than peptide ion chromatogram measurements (47). Spectral counting has the advantage of achieving cleaner, faster, and simpler quantification results in large-scale, global studies of protein change (25, 48, 49).
The spectral counting method was therefore employed for quantitative analysis in the present work, and rigorous statistical analysis and subsequent validation were designed to evaluate specific aspects of the method. Supplementary Figures S3 and S4 show the high reproducibility between technical replicates observed for both the noncancer and cancer sets. In addition, to evaluate the correlation among different batches based on LTQ runs, we examined the batch effect between protein extraction sets (Supplementary Fig. S5), which revealed a high correlation between technical replicates. When compared within a batch, the N/N pairs were most strongly correlated, followed by the T/T pairs and then the N/T pairs, reflecting the relationships between different disease states. However, when compared across different batches, the N/N or T/T pairs were not more strongly correlated than the N/T pairs. This suggested that a batch effect arising from sample preparation was causing variation across batches. Therefore, our differential expression analysis was based on mixed-effects models to reduce the effects of experimental batches. The most promising glycoprotein biomarkers remained significantly differentially expressed when the batch factor was accounted for. The subsequent validation results supported this approach.
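The idea of accounting for a batch factor can be sketched in a simplified form. The example below is not the mixed-effects model used in this study; it is a stand-in that subtracts each batch's mean from its samples, which removes a constant batch offset before groups are compared. The values, batch labels, and function name are hypothetical.

```python
from collections import defaultdict


def center_by_batch(values, batches):
    """Subtract the mean of each batch from the samples in that batch.

    This is a simplified stand-in for a batch term in a model, not a
    mixed-effects fit: it only removes a constant per-batch offset.
    """
    grouped = defaultdict(list)
    for value, batch in zip(values, batches):
        grouped[batch].append(value)
    batch_means = {b: sum(vs) / len(vs) for b, vs in grouped.items()}
    return [value - batch_means[batch] for value, batch in zip(values, batches)]


# Hypothetical log-scale abundances for one glycoprotein (illustrative only).
log_counts = [2.0, 3.0, 4.0, 5.0]
batches = ["batch1", "batch1", "batch2", "batch2"]
groups = ["N", "T", "N", "T"]

adjusted = center_by_batch(log_counts, batches)
```

In this toy example, the T sample exceeds the N sample by the same amount within each batch, but the batch offset obscures this in the raw values; after centering, both N samples and both T samples agree. Centering of this kind is only sensible when each batch contains both groups, which is one reason a model with an explicit batch term is preferable.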
Although a number of biomarkers identified for bladder cancer diagnosis may have utility in solid tissue or serum of patients, urinalysis has obvious advantages for the detection of urological cancer. Most importantly, urine can be collected noninvasively. This enables patient compliance, copious sample collection, and repeat sampling. The problem of protein complex formation is reduced in urine relative to serum, and urinary peptides and low molecular weight proteins are generally soluble. Problems of degradation during collection and processing that are experienced with other tissue and fluid analyses are also avoided, because most proteolytic degradation events have already occurred before voiding (27). A limitation of urinalysis is the large dynamic range in protein and peptide concentration resulting from variations in fluid intake and voiding frequency. Appropriate normalization methods have to be selected to counter this problem. For MS data normalization, we applied total spectral count normalization that takes protein length into account, to adjust for data deficiency and to prevent abundant proteins from dominating the analysis.
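A length-aware total spectral count normalization can be sketched as follows. The exact formula is not given in this section, so the sketch assumes a normalized spectral abundance factor (NSAF)-style calculation: each protein's count is divided by its length, and the result is divided by the sum of these ratios within the sample. The protein labels, counts, and lengths are hypothetical.

```python
def length_adjusted_normalization(spectral_counts, protein_lengths):
    """NSAF-style normalization (an assumed formula, not the paper's own).

    Dividing by protein length compensates for long proteins yielding
    more spectra; dividing by the sample total makes samples comparable.
    """
    per_length = {
        protein: count / protein_lengths[protein]
        for protein, count in spectral_counts.items()
    }
    total = sum(per_length.values())
    if total == 0:
        # No spectra identified in this sample: return zeros.
        return {protein: 0.0 for protein in per_length}
    return {protein: value / total for protein, value in per_length.items()}


# Hypothetical counts and lengths for one urine sample (illustrative only).
counts = {"A1AT": 30, "ALBU": 120, "UROM": 60}
lengths = {"A1AT": 418, "ALBU": 609, "UROM": 640}

normalized = length_adjusted_normalization(counts, lengths)
```

The normalized values sum to 1 within each sample, so a dilute and a concentrated urine sample with the same relative composition yield the same profile.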
Another challenge in urinalysis is the potential presence of blood proteins, because hematuria is a common occurrence in patients with bladder cancer. With blood incursion, serum glycoproteins may be detectable during profiling in some cases. We addressed this problem in a number of ways. We excluded any visibly bloody urine samples and samples defined as having >0.010 mg/dL hemoglobin by urinalysis strip testing (normal levels, see Materials and Methods). In addition, to confirm that candidate biomarkers were not associated with trace serum incursion, we examined the potential correlation between the glycoprotein hemoglobin and the target protein in our spectral counting data for each case. We observed no such association (see Supplementary Fig. S7 for a scatter plot of A1AT and hemoglobin).
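A check of this kind can be sketched as a Pearson correlation between per-sample spectral counts of hemoglobin and of the target protein. The statistic used in this study is not stated in this section, so the choice of Pearson correlation is an assumption, and the counts below are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-sample spectral counts (illustrative only).
hemoglobin_counts = [0, 1, 0, 2, 1, 0]
a1at_counts = [12, 3, 9, 5, 14, 2]

r = pearson_r(hemoglobin_counts, a1at_counts)
```

A coefficient close to zero would be consistent with the candidate protein not simply tracking trace blood contamination, although a scatter plot should still be inspected, because a single coefficient can hide nonlinear patterns.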
As in our previous study (50), we identified a number of expected glycoproteins, including some that are under investigation as biomarkers for bladder cancer, but many glycoproteins will not have been selected by the lectin affinity approach used in this study. Specifically, we used ConA and WGA, which bind alpha-linked mannose and GlcNAc groups, respectively. However, the strategy can be expanded to include additional lectins that are now routinely available to broaden the captured glycoprotein profile. The most discriminatory protein identified as being associated with bladder cancer in this study was A1AT, also known as SERPINA1, a name that identifies it as a member of a family of serine protease inhibitors. A1AT irreversibly inhibits trypsin, chymotrypsin, and plasminogen activator. In humans, serpins play crucial roles in the maintenance of homeostasis, controlling multiple processes from cellular survival to blood clotting. There are a number of known serpin diseases, including emphysema and liver disease, but relatively little is known of the potential role of serpins in cancer. Genetic aberrations of A1AT have been found in some malignancies, including colorectal cancer (51). In a 2-DE and MS proteomics-based analysis, Hamrita and colleagues identified A1AT as upregulated in sera from patients with breast cancer at various stages in comparison with healthy women (52). The progression from localized tumor to invasive metastasis involves matrix proteolysis, and the balance between proteases and their inhibitors plays a key role in this process. Thus, in addition to serving as a diagnostic biomarker, A1AT may be a useful indicator for tumor classification (for example, invasive versus superficial bladder cancer) or for prognostication. A recent study in tumor tissues identified SERPINA1 expression as being correlated with the malignant potential of ovarian cancer (53).
The impetus for our search for bladder cancer biomarkers comes from the idea that an accurate biomarker could reduce the number of cystoscopies, an invasive procedure, performed each year. We have established a comprehensive proteomics strategy for discovery of urinary glycoproteins associated with bladder cancer by using dual-lectin affinity enrichment and LC/MS-MS analysis, and have conducted the first comprehensive study employing glycoprotein profiling of noninvasively obtained urine samples. Clearly, larger, confirmatory studies will be required to validate the candidate biomarkers identified in this study and to compare them with cytology and commercial assays, but the data presented here show that it is possible to detect and characterize bladder cancer based on urinary glycoprotein profiles. In combination with other approaches, this strategy will provide targets for inclusion in multiplex biomarker panels that can be used for diagnosis, surveillance, and asymptomatic screening.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
This study was supported by NIH/NCI grant RO1 CA108597 (S. Goodison) and Team Science Award 10KT-01, James and Esther King Biomedical Research Program, Florida Department of Health (C. Rosser, S. Goodison).
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Note: Supplementary data for this article are available at Clinical Cancer Research Online (http://clincancerres.aacrjournals.org/).
- Received January 23, 2010.
- Revision received February 10, 2011.
- Accepted March 18, 2011.
- ©2011 American Association for Cancer Research.