Purpose: Persistently elevated posttreatment plasma EBV DNA is a robust predictor of relapse in nasopharyngeal carcinoma (NPC). However, assay standardization is necessary for use in biomarker-driven trials. We conducted a study to harmonize the method between four centers with expertise in EBV DNA quantitation.
Experimental Design: Plasma samples of 40 patients with NPC were distributed to four centers. DNA was extracted and EBV DNA copy number was determined by real-time quantitative PCR (BamHI-W primer/probe). Centers used the same protocol but generated their own calibrators. A harmonization study was then conducted using the same calibrators and PCR master mix and validated with ten pooled samples.
Results: The initial intraclass correlations (ICC) for the first 40 samples between each center and the index center were 0.62 [95% confidence interval (CI): 0.39–0.78], 0.70 (0.50–0.83), and 0.59 (0.35–0.76). The largest sources of variability were the use of different PCR master mixes and calibrators. Standardization improved the ICCs to 0.83 (0.5–0.95), 0.95 (0.83–0.99), and 0.96 (0.86–0.99), respectively, for ten archival frozen samples. For fresh plasma with spiked-in EBV DNA, correlations were more than 0.99 between the centers. At 5 EBV DNA copies per reaction or above, the coefficient of variation (CV) was less than 10% for the cycle threshold (Ct) among all centers, suggesting this concentration can be reliably used as a cutoff for defining the presence of detectable EBV DNA.
Conclusions: Quantitative PCR assays, even when conducted in experienced clinical labs, can yield large variability in plasma EBV DNA copy numbers without harmonization. The use of common calibrators and PCR master mix can help to reduce variability. Clin Cancer Res; 19(8); 2208–15. ©2013 AACR.
This is a study to harmonize the measurement of the plasma biomarker EBV DNA in four accredited international labs in order to launch a biomarker-driven international phase III trial in nasopharyngeal carcinoma (NPC). Although EBV DNA is a well-accepted robust prognostic marker in NPC and has been offered in several clinical laboratories as a means to track tumor burden and posttreatment surveillance, little is known about the interlaboratory variability of this quantitative assay. In this study, we showed that the interlaboratory variability is quite large for the same assay using identical procedures and primer/probe set without harmonization. We showed that harmonization, which involves standardization of buffers and calibrators, is feasible and significantly reduces such variability. Through this harmonization process, we established a standardized assay that can be used internationally for the measurement of this biomarker for future prospective studies and developed a process for credentialing new laboratories.
Treatment with concurrent cisplatin (CDDP)-based chemoradiotherapy (CCRT) followed by 3 cycles of adjuvant CDDP and 5-fluorouracil (5-FU) is the current standard of care in the United States for locally advanced nasopharyngeal carcinoma (NPC; ref. 1). Recent advances in radiation delivery and the use of concurrent chemotherapy have substantially increased the rate of local control, now ranging between 91% and 96% (2–6). However, the development of distant metastases remains problematic (∼30% at 5 years) and ultimately results in death (4–6). Compliance with adjuvant chemotherapy after CCRT is also poor, as only half of patients complete all adjuvant chemotherapy because of severe toxicity (1, 7). The results of a recent randomized trial from China, which failed to show a survival advantage for adjuvant chemotherapy (8), called into question the benefit of adjuvant CDDP/5-FU. This study was criticized for not using a noninferiority design, and hence was potentially underpowered. In contrast, another study suggested that adjuvant 5-FU chemotherapy decreases distant metastasis in NPC (9). A plausible contributor to these conflicting findings is the inability to properly classify patients with different risk profiles for enrollment in trials. The development of a biomarker to reliably identify the subset of patients at high risk for metastasis may help to identify patients who would benefit from adjuvant chemotherapy while sparing low-risk patients from unnecessary toxic treatment.
Blood is in direct contact with all organs and is an attractive sample type for noninvasive cancer surveillance. Since the first evidence showing that tumor-associated DNA can be detected in the blood (10), several studies have evaluated tumor DNA as a biomarker for cancer surveillance (11–16). The presence of viral DNA in virus-related tumors offers a distinct marker for detection in the blood. Epstein-Barr virus (EBV) DNA is often found in the plasma of patients with NPC and has been shown to be a reliable marker for prognostication in this disease (14, 15, 17). Specifically, pretreatment plasma EBV DNA correlates with cancer stage (17, 18) and clinical outcome (19) in endemic NPC. Posttreatment (radiation or CCRT) plasma EBV DNA status, classified as detectable or undetectable, has an even better correlation with prognosis and has been used to monitor recurrence after therapy (17, 18, 20, 21). Whereas undetectable posttreatment plasma EBV DNA was associated with an excellent progression-free survival (PFS: 80%–90%), a persistently detectable level was associated with an extremely poor PFS (10%–15%) and may be a marker of subclinical residual disease (21). These observations have been consistently reproduced in large patient cohorts treated in different countries from both endemic and non-endemic areas (18, 20, 21). Published data to date indicate that the most robust and reliable biomarker for NPC prognostication is the posttreatment EBV DNA level.
Given the robustness of posttreatment plasma EBV DNA in prognostication, we propose to incorporate this biomarker in the next RTOG (Radiation Therapy Oncology Group) phase III NPC trial. We postulate that patients with undetectable EBV DNA after CCRT have a low risk for distant relapse; these patients will be randomized to receive adjuvant cisplatin/5-FU (current standard) versus observation. In contrast, those who continue to have detectable EBV DNA levels are at a “high risk” for distant relapse and will be randomized to cisplatin/5-FU (current standard) versus a more intensified regimen. Because this will be an international trial enrolling patients from different countries, and given the logistical difficulty and high cost associated with shipping plasma samples across continents as well as the need for rapid result generation for randomization, it is important to first prospectively validate the assay for EBV DNA and to harmonize it across the different clinical laboratories in participating countries. All four participating laboratories have had extensive experience in measuring plasma EBV DNA in NPC patients and are nationally accredited. Here, we report the results of a prospective effort to harmonize this assay across these four international laboratories in preparation for the upcoming trial.
Although different primer/probe sets have been used to measure circulating EBV DNA in the plasma or serum, the most commonly used primer/probe set is the one targeting the BamHI-W region of the EBV genome (14). This fragment occurs 8 to 11 times in the EBV genome, allowing more sensitive detection than assays targeting single-copy EBV genes, such as EBNA1, LMP2, or POL1 (18, 22). Because the purpose of the phase III trial is to identify patients with detectable EBV DNA at diagnosis for entry into the trial, and to distinguish patients with detectable levels from those with undetectable levels after chemoradiation for risk stratification and treatment assignment, it is critical that the most sensitive assay be used. More importantly, the largest and most robust published studies that established the prognostic significance of posttreatment circulating EBV DNA in NPC employed the BamHI-W primer/probe set (20, 21, 23). Therefore, we decided to use this assay for EBV DNA measurement for the upcoming trial and to focus our efforts on harmonizing the assay in the participating laboratories.
Materials and Methods
The four laboratories selected for this harmonization were: (i) the Stanford Clinical Virology Laboratory [STF, certified under the Clinical Laboratory Improvement Amendments (CLIA)], which will serve as the central laboratory for the United States sites; (ii) the Chemical Pathology laboratory at The Chinese University of Hong Kong, Prince of Wales Hospital, Hong Kong [HK, accredited by the National Association of Testing Authorities (NATA) and the Royal College of Pathologists of Australasia], which will serve as the central laboratory for Hong Kong sites; (iii) the National Taiwan University Hospital Clinical Laboratory [NTU, accredited by the Taiwan Accreditation Foundation (TAF) and a participant in the College of American Pathologists (CAP) proficiency program]; and (iv) the Chang-Gung Memorial Hospital-Linkou Clinical Laboratory (CG, accredited by both TAF and CAP). NTU and CG will serve as the central laboratories for Taiwanese sites. These laboratories were selected because of their commitment to the planned international phase III RTOG trial mentioned above, their experience in measuring plasma EBV DNA, their accreditation status, and their ability to offer the test as a clinical assay.
For the preharmonization study, anonymized plasma samples from 40 newly diagnosed NPC patients were collected from CG with patient consent and distributed to the 4 laboratories. Patients with all stages of disease (22 with stage I and II and 18 with stage III and IV) were included. For the harmonization process, 23 plasma samples from newly diagnosed NPC patients were collected from STF under an institutional review board-approved study (NCT00186433), pooled, and distributed to the 4 laboratories. For the postharmonization validation study, plasma samples from 40 anonymized patients with NPC were combined to create 10 pooled samples, aliquoted, and distributed.
A total of 3.5 mL of blood was collected into EDTA-coated tubes and centrifuged for 10 minutes; the plasma was then recovered and frozen at −80°C until shipped. DNA was isolated manually from 400 μL of plasma using the QIAamp Blood Mini Kit (QIAgen Inc) and eluted with 50 μL of elution buffer or water. For the harmonization, 2 different operators from the same laboratory conducted DNA extraction and qPCR to determine interoperator variability.
Calibrators for standard curves
Calibrators were prepared by using DNA extracted from an EBV-positive cell line Namalwa, which is a diploid cell line containing 2 integrated viral genomes per cell as previously described (14). This DNA was also used for the spiking experiments.
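The calibrator arithmetic implied by this description can be sketched as follows. This is only an illustration: the figure of 6.6 pg of DNA per diploid human cell is an assumption that is not stated in the text, and the helper names and the dilution layout are hypothetical.

```python
# Assumption (not stated in the text): one diploid human cell holds ~6.6 pg of DNA.
PG_DNA_PER_DIPLOID_CELL = 6.6
# From the text: Namalwa carries 2 integrated EBV genomes per cell.
EBV_GENOMES_PER_CELL = 2


def ebv_genomes_from_mass(dna_ng):
    """Estimate EBV genome copies contained in a mass (ng) of Namalwa DNA."""
    cell_equivalents = dna_ng * 1000.0 / PG_DNA_PER_DIPLOID_CELL  # ng -> pg -> cells
    return cell_equivalents * EBV_GENOMES_PER_CELL


def dilution_series(start_copies, factor, steps):
    """Copies per reaction for a serial dilution (hypothetical calibrator layout)."""
    return [start_copies / factor ** i for i in range(steps)]
```

Under these assumptions, 1 ng of Namalwa DNA corresponds to roughly 300 EBV genome copies, which is the kind of figure a laboratory would dilute down to build its standard curve.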
Real-time quantitative PCR assay
DNA samples were quantified for EBV DNA using a real-time quantitative PCR (qPCR) system targeting the BamHI-W fragment region of the EBV genome as described by Lo and colleagues (14). Each run included patient samples, calibrators for constructing a standard curve, and appropriate positive, negative, and no-template controls. All PCRs were performed in triplicate. The following real-time PCR detector systems were used: STF: Rotor-Gene Q (Qiagen) and ABI7900; HK: ABI7300; NTU: ABI7900HT; CG: ABI7900 (Applied Biosystems). The plasma concentration of EBV DNA (copies/mL) was calculated as previously described (14).
All assay results were summarized using means and SDs, and were log-transformed if the data deviated from normality assumptions. Inter-rater reliability was estimated by the intraclass correlation coefficient (ICC), a method to assess reproducibility of assay measures among different laboratories. On the basis of Shoukri and colleagues (24), with a one-sided 5% type I error rate and 80% power, 39 patients were needed to detect a difference of 0.2, assuming an ICC of 0.6 under the null hypothesis between 2 laboratories. The within- and between-subject variations and ICCs were estimated using a general linear model with measurement error (25). The interlaboratory variability in qPCR measurement and DNA extraction was also summarized using the general linear model. The Spearman rank correlation coefficient was used to assess results for the “spike-in” experiment among the 4 laboratories. All analyses were conducted using SAS 9.2.
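As a rough illustration of how a calibrator standard curve turns a Ct into a plasma concentration, the sketch below fits Ct against log10(copies) by least squares and then scales copies per reaction to copies/mL. The scaling formula (copies per reaction, times elution volume over template volume, divided by extracted plasma volume) is our reading of the calculation cited as reference 14; the template volume per reaction is not given in this text, so it is left as a parameter, and all function names are hypothetical.

```python
import math


def fit_standard_curve(calibrator_copies, calibrator_cts):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in calibrator_copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(calibrator_cts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, calibrator_cts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_per_reaction(ct, slope, intercept):
    """Invert the standard curve to get copies per reaction from a sample Ct."""
    return 10 ** ((ct - intercept) / slope)


def plasma_copies_per_ml(q, v_dna_ul, v_pcr_ul, v_ext_ml):
    """Scale copies per reaction (q) to copies per mL of plasma.

    v_dna_ul: elution volume (50 uL in the text)
    v_pcr_ul: eluate volume added per reaction (not stated in the text)
    v_ext_ml: plasma volume extracted (0.4 mL in the text)
    """
    return q * (v_dna_ul / v_pcr_ul) / v_ext_ml
```

Because every sample Ct passes through the fitted slope and intercept, any shift in the calibrators themselves shifts every reported concentration, which is why individually prepared calibrators are a natural candidate for harmonization.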
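To make the ICC concrete, here is a minimal sketch of a one-way random-effects ICC computed from ANOVA mean squares. It is a simpler stand-in for, not a reproduction of, the general linear model with measurement error that was fitted in SAS; the function name and input layout are hypothetical.

```python
def icc_oneway(ratings):
    """One-way random-effects ICC for single measurements.

    ratings: list of per-sample lists, each holding k measurements
             (for example, one log-transformed value per laboratory).
    """
    n = len(ratings)
    k = len(ratings[0])
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]

    # Between-sample and within-sample sums of squares.
    ss_between = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_within = sum(
        (x - m) ** 2 for row, m in zip(ratings, row_means) for x in row
    )

    ms_between = ss_between / (n - 1)
    ms_within = ss_within / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
```

To compare one center with the index center, each inner list would hold that sample's two log-transformed copy numbers. Unlike a plain correlation, this statistic falls when one laboratory reads systematically higher than the other, which is the behavior wanted when absolute copy numbers must agree.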
Because of the small volume of plasma collected per patient, we used 400 μL of plasma for DNA extraction instead of the 800 μL that is normally used in the clinical protocol. The results for the initial run using 40 patient samples are shown in Table 1. There was a large interlaboratory variability in the absolute number of EBV DNA copies/mL. The detection rate for EBV DNA, defined here as >0 copies/mL, was 58% for NTU, 80% for STF, and 93% for both CG and HK. Table 2 shows the ICC for each site compared with STF as the index site. All ICCs were below the desired value of 0.80. The observed variability between centers was not related to differences in qPCR instruments because in one laboratory (STF), we tested these 40 samples on the Rotor-Gene Q and ABI7900 and observed similar results for both instruments (data not shown).
To harmonize the assay, we identified nonstandardized factors that could be modified; these included calibrators (for standard curve), which were individually prepared in each laboratory, and the TaqMan master mix, which was purchased from 2 different vendors. Three laboratories used the premade Roche master mix (Roche Applied Sciences), whereas one laboratory prepared the master mix in-house using components from ABI (Applied Biosystems). As shown in Table 3, the use of different master mixes, when controlled for other aspects of the procedure, including the DNA extraction, operators, other reagents, calibrators and PCR instruments, resulted in a large difference in the measured EBV DNA copy number. Laboratories using the Roche master mix gave similar results; whereas the laboratory using PCR kits from ABI gave results 5- to 10-fold higher. Therefore, standardization was made to use the Roche master mix in subsequent studies.
Next, we assessed the variability in the qPCR step and DNA extraction by different operators within the same laboratory and by different calibrator sets. To test the interlaboratory variability in qPCR, we provided the 4 laboratories with the same amount of extracted EBV DNA from pooled NPC plasma samples at a concentration of approximately 4,000 copies/mL. In one laboratory (HK), the results were available only for one calibrator set. As shown in the top half of Table 4, qPCR variability was much larger for the different calibrator sets than for the different operators. Therefore, we standardized the calibrators using those prepared by HK in all laboratories for the remaining harmonization process. To ensure that calibrator shipping did not result in degradation, we shipped a calibrator set from Hong Kong to the United States and then back to Hong Kong, and tested it against the same batch that was not shipped. Shipping did not result in a significant decrease in calibrator performance (Supplementary Table S1).
We also evaluated the DNA extraction variability between the 4 different laboratories, using the same DNA extraction kit. In 2 laboratories (STF and NTU), we also assessed DNA extraction variability by different operators. For this analysis, aliquots of a pooled plasma sample with EBV DNA concentration of approximately 4,000 copies/mL were distributed to the four laboratories. As shown in the lower half of Table 4, there was minimal interoperator variability. There was likewise good agreement between the 4 laboratories and the difference was within one PCR threshold cycle (data not shown).
We then validated the harmonization with all laboratories using the HK calibrators, Roche master mix, and standardized procedures. Because fresh plasma samples were not available, we pooled 40 archival NPC plasma samples that had previously been frozen and thawed at least once to generate 10 pooled samples with different concentrations. The samples were shipped from Hong Kong to the United States and then to Taiwan. As shown in Table 2, the ICC improved from 0.62 to 0.83 for NTU versus STF, from 0.70 to 0.72 for CG versus STF, and from 0.59 to 0.96 for HK versus STF. Interestingly, significant protein aggregation was noted in the plasma samples received by the CG site, presumably because of a prolonged stay at room temperature related to delayed shipment delivery. Measurements of EBV DNA from the protein aggregate and the supernatant of the same plasma samples yielded significantly different results, with the aggregate showing 2.3 to 2.5 times higher copy number than the supernatant. Therefore, CG used the leftover samples received by the other Taiwanese site, NTU, which did not have much aggregation. Results from this repeated assay showed an ICC of 0.95 between CG and STF (Table 2). As all laboratories will be testing fresh plasma samples without protein aggregation for the trial, another way to validate the harmonization is to “spike in” known concentrations of EBV DNA into fresh, negative, non-NPC plasma samples. Table 5 shows the result of the “spike-in” experiment, which was highly consistent between the 4 laboratories. The correlations were >0.99 (P < 0.0001) between Stanford and the other 3 laboratories.
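The rank correlation used for the spike-in comparison can be sketched in plain Python as below. This is an illustrative implementation (average ranks for ties, then a Pearson correlation on the ranks), not the SAS procedure used in the study, and the function names are hypothetical.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for idx in order[i:j + 1]:
            ranks[idx] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Here `x` and `y` would be the copies/mL measured by two laboratories for the same spiked samples.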
Because an important aspect of the planned phase III trial is to distinguish patients with detectable and undetectable post-CCRT plasma EBV DNA for risk stratification, it is important to show that the harmonized assay is sensitive enough to measure a very low level of EBV DNA in the plasma. As there are 8 to 11 units of the BamHI-W fragment in each EBV genome, this assay is more sensitive than assays detecting targets with a single copy. To investigate the detection limit of this assay for all involved laboratories, we analyzed 10 to 20 replicates of diluted DNA from the Namalwa cell line at concentrations of 0.5, 1.25, 2.5, 5, 25, and 100 copies per reaction. Although the assay showed positive signals in several replicates at concentrations below 5 copies per reaction, the CV for the PCR threshold cycle (Ct) was greater than the 10% that is normally accepted for a clinical test. At 5 copies per reaction, the CV was consistently less than 10% for all 4 sites. However, even at these low CVs, the SD for Ct can be up to 1.1 cycles. If we use a fixed Ct threshold (mean or median value), up to approximately 50% of the samples having that concentration would be falsely excluded. Therefore, we decided to use the mean Ct value + 2 SDs at the concentration of 5 copies per reaction as a cutoff for defining a detectable level in the subsequent clinical trial. Theoretically, this would include 95% of the samples having an actual concentration of 5 copies/reaction, which translates to 60 copies/mL of patient plasma.
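The quantities in this paragraph (the CV of Ct across replicates, the mean + 2 SD cutoff, and the conversion of 5 copies per reaction to 60 copies/mL) can be sketched as follows. The function names are hypothetical, the sample SD is assumed for the CV, and the multiplier of 12 is simply the ratio implied by the two figures quoted in the text rather than a value derived from extraction volumes.

```python
from statistics import mean, stdev

# Implied by the text: 5 copies/reaction corresponds to 60 copies/mL of plasma.
COPIES_PER_ML_PER_REACTION_COPY = 60 / 5


def ct_summary(replicate_cts):
    """Return (mean Ct, sample SD, CV in percent) for replicate Ct values."""
    m = mean(replicate_cts)
    sd = stdev(replicate_cts)
    return m, sd, 100.0 * sd / m


def detectability_cutoff(cts_at_5_copies, n_sd=2.0):
    """Cutoff Ct = mean + n_sd * SD of replicates run at 5 copies/reaction."""
    m, sd, _ = ct_summary(cts_at_5_copies)
    return m + n_sd * sd


def is_detectable(sample_ct, cutoff_ct):
    """A sample is called detectable if it amplified with a Ct below the cutoff.

    sample_ct is None when no amplification was observed.
    """
    return sample_ct is not None and sample_ct < cutoff_ct


def copies_per_ml(copies_in_reaction):
    """Convert copies per reaction to copies/mL using the implied multiplier."""
    return copies_in_reaction * COPIES_PER_ML_PER_REACTION_COPY
```

A lower Ct means more template, so the cutoff is an upper bound on Ct: replicates that amplify later than mean + 2 SD are treated as undetectable.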
To conduct a biomarker-driven study, it is crucial that the assay for the biomarker is performed in a central laboratory with CLIA or equivalent certification, which applies to all participating laboratories here. However, in situations where it is not feasible to use one central laboratory due to the size of the study and the logistical/cost issues of shipping fresh samples across continents, it is important that the assay be standardized across the participating laboratories. Although plasma EBV DNA is a well-known prognostic marker for NPC and is currently being offered as a clinical test at several institutions, little is known about the inter-laboratory variability of the assay.
Here, we showed that the interlaboratory variability between clinical laboratories with significant experience in this assay could be quite large without harmonization. Major contributors to the interlaboratory variability were the PCR master mix and calibrators, more so than interoperator variability. Surprisingly, different PCR master mixes yielded more than 5-fold divergence in EBV DNA measurements, despite using the same calibrators, suggesting a difference in amplification efficiencies between plasma DNA and calibrators. Similarly, different calibrator sets, though prepared from the same cell line and protocol, resulted in larger variability than that seen between operators. Hence, harmonization using the same calibrators and PCR master mixes should result in less interlaboratory inconsistency. For the planned clinical trial, calibrators will be prepared in HK and shipped to all sites. Fresh calibrators will also be calibrated against old ones to maintain consistency over time. Shipping calibrators on dry ice did not result in degradation or affect calibrator performance.
The first WHO International Standard for EBV for Nucleic Acid Amplification Techniques NIBSC code: 09/260 became available during the course of this study. Given the current efforts to harmonize this assay, we anticipate that additional harmonization using World Health Organization (WHO) material will further improve correlation and allow the results of future biomarker trials to be more generalizable. Therefore, we plan to use both the WHO standards and the Namalwa DNA as calibrators during the trial to prospectively compare the performance of both standards in a large NPC patient group.
Although harmonization resulted in ICC improvement for all sites, it was less marked for the Taiwanese sites compared to that between STF and HK. Our detailed analyses indicated that prolonged exposure to room temperature of previously frozen plasma samples resulted in marked protein aggregation, which surprisingly influenced the readings. Higher levels were noted from the protein aggregate than the supernatant from the same plasma samples. In contrast, samples without aggregation and the use of “spike-in” samples, which closely resembled fresh plasma, showed minimal variability between the laboratories, confirming the success of the harmonization.
Because of (i) the sensitive nature of qPCR, (ii) the amplification factor used to convert copies per reaction to copies/mL that can magnify interoperator and interlaboratory variability, and (iii) Ct being the direct assay readout, we believe that the Ct value is more useful than copies/mL in defining the minimum detection limit of this assay for risk stratification. Although 2 laboratories (STF and HK) were able to detect EBV DNA at a concentration as low as 0.5 copy/reaction (i.e., 6 copies/mL), the detectability of EBV DNA at this level was relatively unpredictable for the other 2 laboratories (CG and NTU). Therefore, if we simply use any detectable level of EBV DNA as the criterion for risk stratification, significant intercenter variation would be expected. On the other hand, if we use a fixed quantitative cutoff value, the variation in quantitative measurement may also result in significant intercenter differences. For example, if we measure 100 plasma aliquots, each with a putative EBV DNA concentration of 500 copies/mL, and use the median measured concentration as a cutoff, then 50 aliquots would have a measured concentration above and 50 would have a measured concentration below the cutoff. The samples with a measured concentration below the cutoff value by random variation would be falsely rejected. To resolve this potential confounding issue, we propose to use a cutoff value derived from the mean Ct of a concentration that all 4 laboratories can consistently detect (≥5 copies/reaction) plus 2 SDs. Using this cutoff ensures that 95% of the samples having an actual concentration of 5 copies/reaction or 60 copies/mL would be correctly classified as detectable. To obtain the cutoff for each run, the laboratory will include 10 to 20 replicates of 5 copies per reaction to accurately determine the mean and SD for the Ct value at this concentration; the detectability point will be at 2 SDs above the mean Ct value. Plasma samples having a Ct below this cutoff (corresponding to a higher EBV DNA copy number) will be regarded as having a detectable level of EBV DNA.
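The argument against a cutoff placed at the mean or median can be illustrated with a small calculation. The sketch assumes that replicate Ct values at a given concentration are normally distributed, which the text does not state; the mean Ct of 37.0 in the usage lines is a hypothetical value, while the SD of 1.1 cycles is the upper figure reported in the Results.

```python
from statistics import NormalDist


def fraction_called_detectable(mean_ct, sd_ct, cutoff_ct):
    """Expected share of replicates with Ct at or below the cutoff,
    assuming Ct ~ Normal(mean_ct, sd_ct)."""
    return NormalDist(mean_ct, sd_ct).cdf(cutoff_ct)


# Hypothetical mean Ct at 5 copies/reaction; SD of 1.1 cycles as in the Results.
MEAN_CT, SD_CT = 37.0, 1.1
at_mean = fraction_called_detectable(MEAN_CT, SD_CT, MEAN_CT)
at_mean_plus_2sd = fraction_called_detectable(MEAN_CT, SD_CT, MEAN_CT + 2 * SD_CT)
```

Under this assumption `at_mean` is 0.5, matching the point that half of true positives would be rejected, whereas `at_mean_plus_2sd` is about 0.977, which is at least the 95% figure quoted in the text.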
In summary, we detected significant variability in plasma EBV DNA measurements between different clinical laboratories, which substantially improved with harmonization. This establishes a standardized assay that can be used internationally for the measurement of this biomarker for future prospective studies. It also provides a process for credentialing new laboratories, and ensures that the trial results will be applicable to the real world. The development of clinically actionable biomarkers is the key to personalized medicine and this harmonization is an important benchmark for all future biomarker-driven studies.
Disclosure of Potential Conflicts of Interest
Y.M.D. Lo is a consultant/advisory board member of Sequenom. No potential conflicts of interest were disclosed by the other authors.
Conception and design: Q.-T. Le, Q. Zhang, H. Cao, N.Y. Lee, K.-K. Ang, A.T.C. Chan, K.C.A. Chan
Development of methodology: Q.-T. Le, H. Cao, K.-C. Tsao, Y.M.D. Lo, K.C.A. Chan
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): Q.-T. Le, A.-J. Cheng, B.A. Pinsky, R.-L. Hong, J.T.-C. Chang, C.-W. Wang, K.-C. Tsao, N.Y. Lee, A.T.C. Chan, K.C.A. Chan
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): Q.-T. Le, Q. Zhang, H. Cao, A.-J. Cheng, B.A. Pinsky, K.-C. Tsao, K.-K. Ang, K.C.A. Chan
Writing, review, and/or revision of the manuscript: Q.-T. Le, Q. Zhang, B.A. Pinsky, J.T.-C. Chang, C.-W. Wang, Y.M.D. Lo, N.Y. Lee, K.-K. Ang, A.T.C. Chan, K.C.A. Chan
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): J.T.-C. Chang, Y.M.D. Lo, A.T.C. Chan
Study supervision: Q.-T. Le, K.-C. Tsao, Y.M.D. Lo, A.T.C. Chan
This study was supported by grant U10 CA21661 from the National Cancer Institute (NCI) and by the Li Ka Shing Foundation (to Y.M.D. Lo, A.T.C. Chan, and K.C.A. Chan).
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Note: Supplementary data for this article are available at Clinical Cancer Research Online (http://clincancerres.aacrjournals.org/).
- Received November 30, 2012.
- Revision received February 6, 2013.
- Accepted February 6, 2013.
- ©2013 American Association for Cancer Research.