Purpose: Fluorine-18 fluorodeoxyglucose positron emission tomography with CT attenuation correction (18F-FDG PET/CT) is useful in the detection and enumeration of focal lesions and in semiquantitative characterization of metabolic activity (glycolytic phenotype) by calculation of glucose uptake. Total lesion glycolysis (TLG) and metabolic tumor volume (MTV) have the potential to improve the value of this approach and enhance the prognostic value of disease burden measures. This study aims to determine whether TLG and MTV are associated with progression-free survival (PFS) and overall survival (OS), and whether they improve risk assessments such as International Staging System (ISS) stage and GEP70 risk.
Experimental Design: 192 patients underwent whole body PET/CT in the Total Therapy 3A (TT3A) trial and were evaluated using three-dimensional region-of-interest analysis with TLG, MTV, and standard measurement parameters derived for all focal lesions with peak SUV above the background red marrow signal.
Results: In multivariate analysis, baseline TLG > 620 g and MTV > 210 cm3 remained significant predictors of poor PFS and OS after adjusting for baseline myeloma variables. Combined with the GEP70 risk score, TLG > 205 g identifies a high-risk–behaving subgroup with poor expected survival. In addition, TLG > 205 g accurately divides ISS stage II patients into two subgroups with similar outcomes to ISS stage I and ISS stage III, respectively.
Conclusions: TLG and MTV have significant survival implications at baseline and offer a more precise quantitation of the glycolytic phenotype of active disease. These measures can be assessed more readily than before using FDA-approved software and should be standardized and incorporated into clinical trials moving forward. Clin Cancer Res; 23(8); 1981–7. ©2016 AACR.
Total lesion glycolysis (TLG) and metabolic tumor volume (MTV) are volumetric parameters applicable to 18F-FDG PET/CT scans that reflect the glycolytic phenotype and overall tumor burden of focal lesions in multiple myeloma more accurately than conventional measurements. TLG and MTV are highly associated with progression-free and overall survival, and they enhance traditional disease burden scores such as the GEP70 risk score and International Staging System (ISS) stage. Adding a high TLG component to GEP70 risk identifies patients who are classified as low risk but whose outcome is similar to that of high-risk patients. Likewise, adding TLG to the ISS divides ISS stage II cases into low- and high-risk–behaving subgroups with similar outcomes to ISS stage I and ISS stage III, respectively. Therefore, the assessment of TLG and MTV can significantly improve the prognostic value of GEP and ISS in myeloma.
Focal lesions, consisting of collections of abnormal plasma cells, develop first in the bone marrow, and remain confined there until subsequently expanding beyond bone by direct extension or development of independent, extramedullary foci. Lesions can be detected using MRI and fluorine-18 fluorodeoxyglucose positron emission tomography with CT attenuation correction (18F-FDG PET/CT), and are detectable by these advanced imaging techniques in advance of the appearance of lytic lesions of bone on radiography or CT. A complex relationship exists between myeloma cells and the stroma, osteoblasts, and osteoclasts in the focal lesions which favors proliferation and survival of the myeloma clone. Importantly, bone-signaling networks are disturbed and, as the lesions progress, osteolysis occurs leading to the development of the lytic lesions typical of myeloma. As well as contributing to the overall clinical symptoms of myeloma patients, focal lesions act as reservoirs of drug-resistant cells favoring disease progression and relapse. Understanding the nature of focal lesions and their value as a predictive test of clinical outcome is an important area of research.
To date, clinical reporting of focal lesions has almost exclusively taken account of their number and uptake intensity (1). The prognostic value of such measurements has been documented in monoclonal gammopathy of undetermined significance (MGUS), smoldering myeloma, and myeloma, where the number of lesions and their maximum standardized uptake values (SUVmax) have been shown to correlate with progression-free (PFS) and overall survival (OS; refs. 2–6). In this context, we have shown that, in newly diagnosed patients, the number of PET lesions (0, 1–3 and >3) correlates with clinical outcome, with patients with >3 lesions having a worse complete remission duration (CRD), PFS, and OS (1, 2). Having more than 3 focal lesions has also been linked to GEP70-positive high-risk disease as well as to higher proliferation (3). The SUVmax of focal lesions is also important with values exceeding 3.9 correlating with impaired CRD, PFS, and OS (2), as well as with higher numbers of lesions and GEP70 high-risk status (3). We and others have also shown that the persistence of PET uptake after therapy is linked to impaired clinical outcome (1, 2, 4) with the persistence of >3 lesions at day 7 after treatment and/or before stem cell transplantation being linked to inferior PFS and OS (1, 2).
The number and maximum intensity of lesions are markers of disease burden and glycolytic activity, respectively. Total lesion glycolysis (TLG) is theoretically superior to these measurements because it incorporates both parameters while taking into account the level of glucose accumulation within the total volume of all the regions of interest. Metabolic tumor volume (MTV) quantifies the total metabolic tumor burden. Commercial algorithms that assist in measuring both of these variables are now available and facilitate standardization of the methods.
There is increasing evidence for the prognostic value of quantitative parameters obtained from initial staging using 18F-FDG PET/CT in patients with many solid tumors, lymphoma, and myeloma (7–11). To date, the SUVmax has been the most widely studied parameter, with higher levels of glucose accumulation serving as a proxy for glycolytic phenotype and correlating with tumor grade in solid tumors (12). More recent studies include volume-based metabolic assessments such as MTV and TLG (7). The focus of this study was to examine the role of TLG and MTV in myeloma and to determine whether these variables have greater clinical utility than the previously used variables, which were based on the number of lesions and the SUVmax. The analysis considers the ability of TLG, in combination with previously defined risk scores such as the GEP70 risk score and International Staging System (ISS) stage, to identify patients with high-risk disease features.
Materials and Methods
A total of 192 patients enrolled in Total Therapy 3A, the details of which have been previously reported, underwent baseline whole body 18F-FDG PET/CT (13, 14). Of the 303 multiple myeloma patients who had undergone Total Therapy 3A (TT3A) treatment, 108 were excluded from the study because their baseline 18F-FDG PET/CT study could not be retrieved from the data archive. Three further patients were excluded because their 18F-FDG PET/CT studies were considered nondiagnostic. Because it was random which 111 patients from the TT3A protocol had corrupted PET data files or nondiagnostic studies, we believe omitting these patients introduces no bias into the analysis (Supplementary Table S1). As of March 17, 2015, the mean follow-up was 8.46 years. Fifty-one of 192 patients (27%) were over the age of 65 years, 65 of 192 (34%) had an albumin < 3.5 g/dL, 81 of 192 (42%) had a B2M ≥ 3.5 mg/L, and 60 of 192 (31%) had an elevated LDH (Table 1). The GEP risk assessment and molecular subgroup distribution were typical of a group of newly diagnosed patients, with 27 of 176 (15%) patients being high risk by GEP70. The GEP70 risk score has previously been shown to be highly predictive of survival in myeloma, with high-risk and low-risk (LR) cases having disparate median overall survival times of 24 and 120+ months, respectively (15). The protocol was approved by the University of Arkansas for Medical Sciences Institutional Review Board, and all patients signed informed consent in keeping with institutional, federal, and international guidelines. Response was assessed using the European Bone Marrow Transplant (EBMT) criteria.
Baseline PET/CT scans were performed after 6–8 hours of fasting and after the intravenous administration of 10–15 mCi (370–555 MBq) of 18F-FDG. After 50–70 minutes of uptake, images were acquired on either a CTI-Reveal or a Biograph 6 PET/CT system (Siemens Medical Systems), both with full ring LSO crystal configurations. PET images were generated by three-dimensional (3D) iterative reconstruction on a 168 × 168 matrix, with a zoom of 1.0, FWHM filter of either 5.0 or 6.0 mm, and 2 iterations with 8 subsets. CT data were used for localization and attenuation correction. 18F-FDG PET/CT images underwent a 3D region-of-interest analysis of the axial and appendicular skeleton using the commercially available, FDA-approved “Mirada Medical PET/CT XD Oncology Review” software (Mirada Medical).
The background red marrow of each patient was defined by using a 1 cm3 region of interest in the most inferior vertebral body which did not demonstrate focally increased FDG uptake or vertebroplasty material. Focal lesions for each patient were defined as focal areas, measuring at least 1 cm in diameter, not otherwise demonstrated to be artefacts by comparison with coregistered CT, recognizable as discrete foci of increased 18F-FDG uptake on maximum intensity projection (MIP) images, and exhibiting a peak SUV (SUVpeak) greater than the SUVpeak for the patient's background red marrow (Fig. 1). The volume of each lesion and its 3D margins were determined by incorporating all contiguous pixels with activity greater than 0.1 g/mL above that of the background marrow. Because of the considerable statistical variability inherent in the acquisition, reconstruction, and display of accumulations of radiopharmaceuticals in the clinical imaging setting, SUVs obtained from larger regions of interest (ROI) are more reproducible than single pixel determinations such as SUVmax. For this reason, we have chosen to quantify activity by calculating the SUVpeak, defined as the average SUV, corrected for lean body mass, of the pixels in a sphere 1.2 cm in diameter (1 cc) centered to include the most intense pixel (16). The total MTV for disease in each patient was defined as the sum of the MTVs of all the individual focal lesions identified in the analysis. The TLG of each focal lesion was calculated by multiplying the MTV of that lesion by its corresponding mean SUV. The global TLG of each patient was defined as the sum of the TLGs for all the focal lesions in the analysis.
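The arithmetic behind these definitions can be illustrated with a minimal Python sketch. It is not the Mirada software's implementation; the `Lesion` class, the function names, and the three lesion values are hypothetical and chosen only to show how per-lesion MTV and mean SUV combine into a patient-level total MTV and global TLG. The last helper checks that a sphere 1.2 cm in diameter has a volume of roughly 1 cc.

```python
import math
from dataclasses import dataclass


@dataclass
class Lesion:
    """Hypothetical container for one segmented focal lesion."""
    mtv_cm3: float   # metabolic tumor volume of the lesion (cm^3)
    suv_mean: float  # mean SUV inside the lesion volume (g/mL)


def lesion_tlg(lesion: Lesion) -> float:
    """TLG of one lesion: MTV x mean SUV (cm^3 * g/mL gives g)."""
    return lesion.mtv_cm3 * lesion.suv_mean


def patient_totals(lesions: list[Lesion]) -> tuple[float, float]:
    """Return (total MTV, global TLG), each summed over all focal lesions."""
    total_mtv = sum(les.mtv_cm3 for les in lesions)
    global_tlg = sum(lesion_tlg(les) for les in lesions)
    return total_mtv, global_tlg


def sphere_volume_cm3(diameter_cm: float) -> float:
    """Volume of a sphere, used to check the size of the SUVpeak sphere."""
    return (4.0 / 3.0) * math.pi * (diameter_cm / 2.0) ** 3


# Invented example values, not patient data.
lesions = [Lesion(12.0, 3.5), Lesion(40.0, 5.0), Lesion(8.0, 2.5)]
total_mtv, global_tlg = patient_totals(lesions)
print(f"total MTV = {total_mtv:.1f} cm3, global TLG = {global_tlg:.1f} g")
print(f"1.2 cm sphere volume = {sphere_volume_cm3(1.2):.3f} cm3")
```

In this invented example the patient would fall between the 205 g and 620 g TLG cut-off points and between the 55 cm3 and 210 cm3 MTV cut-off points used in the analysis.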
Statistical analysis was performed using R 3.2.2 (17) and SAS (SAS Institute) software. Univariate and multivariate analyses of clinical and imaging variables were performed using Cox proportional hazards regression. Because of high correlation, TLG and MTV scores were considered in separate analyses. A P value of <0.05 was considered statistically significant. Survival analysis was performed using the Kaplan–Meier method and log-rank tests. Clinical endpoints included OS, PFS, and CRD. CRD is measured as the time from complete response onset to disease progression or death from any cause. To determine the cut-off points on the Kaplan–Meier curves for TLG and MTV, the running log-rank statistic produced by each hypothetical cut-off point in the dataset was graphed against TLG score, and the TLG score that coincided with the highest log-rank statistic was chosen as the cut-off point. The cut-off points for both TLG and MTV were based on the PFS endpoint. Random permutations (1,000) of both TLG and MTV were performed and optimal cut-off points determined that maximized the log-rank statistic, chosen with 80% of possible values of the covariate. Permuted log-rank statistic values were determined for the 0.01 and 0.05 significance levels. The log-rank test statistics at the optimal binary cut-off points found for the true covariate values exceeded those of the randomly permuted values at both the 0.01 and 0.05 levels.
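The cut-off search described above can be sketched in Python as follows. This is an illustrative reconstruction, not the R or SAS code used in the study. The function names are hypothetical, the data are synthetic, and the restriction of candidate cut-off points to the central 80% of covariate values (10th to 90th percentile) is an assumption about what "80% of possible values" means. The demonstration uses a small sample and 50 permutations, rather than 1,000, only to keep the run time short.

```python
import numpy as np


def logrank_chi2(time, event, group):
    """Two-group log-rank chi-square statistic (1 degree of freedom)."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    group = np.asarray(group, dtype=bool)
    observed_minus_expected, variance = 0.0, 0.0
    for t in np.unique(time[event]):          # loop over distinct event times
        at_risk = time >= t
        n = at_risk.sum()                     # subjects at risk, both groups
        n1 = (at_risk & group).sum()          # at risk in the "high" group
        died = (time == t) & event
        d = died.sum()                        # events at time t, both groups
        d1 = (died & group).sum()             # events at time t, "high" group
        observed_minus_expected += d1 - d * n1 / n
        if n > 1:                             # hypergeometric variance term
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return observed_minus_expected ** 2 / variance if variance > 0 else 0.0


def best_cutpoint(time, event, x, lower_q=0.1, upper_q=0.9):
    """Scan candidate cut-offs; return the one maximizing the log-rank statistic."""
    x = np.asarray(x, dtype=float)
    lo, hi = np.quantile(x, [lower_q, upper_q])   # assumed central 80% window
    best_c, best_stat = None, -1.0
    for c in np.unique(x[(x >= lo) & (x <= hi)]):
        high = x > c
        if high.all() or not high.any():
            continue                              # need two non-empty groups
        stat = logrank_chi2(time, event, high)
        if stat > best_stat:
            best_c, best_stat = c, stat
    return best_c, best_stat


def permutation_null(time, event, x, n_perm=1000, seed=0):
    """Maximal log-rank statistics after randomly permuting the covariate."""
    rng = np.random.default_rng(seed)
    return np.array([best_cutpoint(time, event, rng.permutation(x))[1]
                     for _ in range(n_perm)])


# Synthetic demonstration data (not patient data).
rng = np.random.default_rng(1)
x = rng.lognormal(mean=4.0, sigma=1.5, size=40)          # pseudo-TLG values
raw = rng.exponential(scale=np.where(x > 205, 2.0, 10.0))
event = raw < 8.0                                         # censor at 8 years
time = np.minimum(raw, 8.0)

cut, stat = best_cutpoint(time, event, x)
null = permutation_null(time, event, x, n_perm=50)
crit_05, crit_01 = np.quantile(null, [0.95, 0.99])
print(f"cut-off = {cut:.1f}, statistic = {stat:.2f}")
print(f"permuted critical values: 0.05 -> {crit_05:.2f}, 0.01 -> {crit_01:.2f}")
```

Comparing the observed maximal statistic with the permuted critical values, rather than with the nominal chi-square distribution, accounts for the optimism introduced by selecting the cut-off point from the same data.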
Baseline characteristics show that 62 of 192 patients (32%) had no detectable lesions by PET. One to 3 lesions were present in 60 of 192 patients (31%) and more than 3 lesions in 70 of 192 patients (36%; Table 1). A baseline TLG > 205 g was seen in 18% (34/192) of patients, with a TLG > 620 g being seen in 7% (14/192). A baseline MTV > 210 cm3 was seen in 7% (14/192). The distribution of patients with high TLG scores between molecular subgroups was not even, with patients in the PR, MF, and HY subgroups having higher scores (P = 0.0135; Supplementary Fig. S1A; ref. 18).
The 7-year PFS and OS were 61.4% and 74.1% for patients with a baseline TLG of less than 205 g, 40.0% and 55.0% for patients with a baseline TLG between 205 g and 620 g, and 0% (no patient remained progression free at 7 years) and 7.1% for patients with a baseline TLG of greater than 620 g (P < 0.0001; Fig. 2). As expected, survival curves for PET focal lesions also show differences in PFS and OS based on focal lesion count (Supplementary Fig. S1B). A similar pattern was seen for CRD (Supplementary Fig. S1C).
The MTV also had prognostic importance: the 7-year PFS and OS were 63.3% and 75.5% for patients with a baseline MTV of less than 55 cm3, 35.5% and 54.8% for patients with a baseline MTV between 55 cm3 and 210 cm3, and 7.1% and 7.1% for patients with a baseline MTV of greater than 210 cm3 (P < 0.0001). The Kaplan–Meier survival curves are very similar to those of the TLG (Supplementary Fig. S1D). The Pearson correlation coefficient (r) between MTV and TLG was 0.94232 (P < 0.0001).
In univariate analysis, baseline PET/CT with >3 focal lesions [HR = 1.83; 95% confidence interval (CI), 1.17–2.86; P = 0.0076], TLG > 205 g (HR = 2.99; 95% CI, 1.83–4.88; P < 0.0001), baseline TLG > 620 g (HR = 6.90; 95% CI, 3.67–12.97; P < 0.0001), and MTV > 210 cm3 (HR = 6.20; 95% CI, 3.33–11.54; P < 0.0001) were statistically significant and associated with OS and PFS (Table 1).
In multivariate analysis, PFS and OS were dominantly affected by baseline TLG > 620 g, together with high B2M, LDH, creatinine, and GEP-based centrosome index (Table 2). Importantly, baseline TLG > 620 g and baseline MTV > 210 cm3 were associated with worse PFS and OS than baseline PET with >3 focal lesions in both univariate and multivariate models. For OS, TLG > 620 g and baseline MTV > 210 cm3 were the dominant adverse features, imparting a 4.97- and 5.49-fold higher risk of death, respectively. Similarly, for PFS, the most prominent adverse features were TLG > 620 g and MTV > 210 cm3, each with an approximately 5.5-fold higher risk of relapse or death.
Usmani and colleagues (1) have previously described the prognostic value of FL number and SUVmax in the TT3 population. Using the cut-off point that Usmani and colleagues reported as highly associated with survival outcomes (SUVmax > 3.9), we found that SUVmax remained significant in the univariate analysis (HR = 1.59; 95% CI, 1.01–2.5; P = 0.0422) but, unlike TLG and MTV, did not remain a significant covariate in the multivariate analysis. The Kaplan–Meier survival curves comparing SUVmax and TLG > 205 g show that TLG is more highly associated with PFS and OS than SUVmax (Supplementary Fig. S1E).
Finally, in our analysis, we explored the role of TLG, GEP70, and ISS stage in identifying patients with high-risk disease. When incorporating differences in TLG scores, we find a subset of GEP70 LR disease and high TLG (TLG > 205 g) patients with similar outcome to GEP70 high-risk patients (log-rank P = 0.7013 for PFS and 0.5818 for OS; Fig. 3A). These LR and TLG-high cases make up 13% of all LR, and when combined with the GEP70 high-risk now form a larger high-risk–behaving cohort comprising 27% of the total patients with a 3-year PFS and OS of 60%. Patients with both LR and low TLG have 7-year PFS and OS rates of 63.6% and 78.3%, compared with LR and high TLG of 30.0% and 40.0% (P < 0.0001). GEP70 risk and TLG are independent prognostic factors (χ2 = 9.82; P = 0.001). The additional information provided by TLG scores that relate to outcome also divides ISS stage II cases into low- and high-risk–behaving groups. These subgroups have similar outcomes to ISS stage I and ISS stage III, respectively (Fig. 3B). Patients from this cohort, with baseline ISS stage II, have 7-year PFS and OS of 58.8% and 69.1%. When TLG is included to separate poor performing patients, ISS stage II with low TLG have 7-year PFS and OS of 68.5% and 77.8% compared with ISS stage II patients with high TLG of 21.4% and 35.7%.
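The two stratification rules described in this paragraph can be written out as a short Python sketch. It is purely illustrative and is not a validated clinical tool; the function names and labels are hypothetical, and the only value taken from the study is the 205 g TLG cut-off point.

```python
TLG_CUTOFF_G = 205.0  # TLG cut-off point (g) reported in this study


def combined_gep70_tlg_risk(gep70_high_risk: bool, tlg_g: float) -> str:
    """Group GEP70 high-risk cases together with GEP70 LR cases that have high TLG."""
    if gep70_high_risk or tlg_g > TLG_CUTOFF_G:
        return "high-risk-behaving"
    return "low-risk"


def refine_iss_stage(iss_stage: int, tlg_g: float) -> str:
    """Split ISS stage II by TLG; stages I and III are left unchanged."""
    if iss_stage not in (1, 2, 3):
        raise ValueError("ISS stage must be 1, 2, or 3")
    if iss_stage != 2:
        return f"ISS stage {iss_stage}"
    if tlg_g > TLG_CUTOFF_G:
        return "ISS stage 2, high TLG (behaves like stage 3)"
    return "ISS stage 2, low TLG (behaves like stage 1)"


# Invented example values, not patient data.
print(combined_gep70_tlg_risk(gep70_high_risk=False, tlg_g=310.0))
print(refine_iss_stage(2, 120.0))
```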
This is among the largest series to evaluate volumetric PET measurements in oncology, and is the largest to date evaluating volumetric assessment of PET in multiple myeloma. In this work, we show that baseline TLG > 620 g and baseline MTV > 210 cm3 are significantly associated with poor PFS and OS in patients with myeloma. While TLG, MTV, and the number of focal osseous lesions were all found to be statistically significant in evaluating tumor burden, TLG and MTV were found to be more highly associated with survival than the number of focal osseous lesions. In a smaller study of 47 patients, Fonti and colleagues explored the role of TLG and MTV and prognosis in a mixed group of patients who received various therapies (19). They noted that a MTV value of 77.6 mL and a TLG value of 201.4 g predicted patients with a good OS. Our study extends these findings in a clinical trial setting, comparing these functional parameters to the traditional metrics of lesion number and SUVmax, and demonstrates their clinical utility in predicting both PFS and OS.
As well as increased predictive/prognostic power, measurement of TLG and MTV has a further clinical advantage over the manual determination of the SUVmax and number of lesions. Recent improvements in commercially available software now render the process of lesion detection and delineation more practicable than with the labor-intensive techniques required in the past. These advances result in a higher degree of reproducibility and, therefore, greater clinical accuracy and utility. Although the software for calculating TLG and MTV is FDA approved, international standards for image acquisition and lesion delineation should be developed to facilitate meaningful data comparisons in clinical research and trials.
TLG as a means of PET evaluation of neoplasms was first described by Larson and colleagues in 1999 (7). More than 140 studies of the value of FDG/PET TLG assessment in the evaluation of solid tumors have been published since this initial publication describing its utility in monitoring disease response to chemotherapy. In these studies, a number of different methods and a wide range of threshold levels have been proposed to calculate the volume-based PET parameters. Some studies suggest that TLG may be superior to MTV, although this remains controversial. Our data would be in keeping with its superiority, as comparing our results to those of the smaller study of Fonti and colleagues, the TLG values are similar but the MTV values are different (TLG 205 g/620 g vs. 201.4 g, and MTV 55/210 mL vs. 77.6 mL; ref. 19).
The ability to detect focal lesions in plasma cell dyscrasias is becoming increasingly important. As well as being able to predict prognosis in myeloma, recent studies also demonstrate that patients with MGUS and smoldering myeloma who have focal lesions on MRI have a shorter time to progression to myeloma (5, 6). These studies indicate that patients with smoldering myeloma with more than one lesion on MRI should be considered as having a myeloma-defining event and be offered therapy (20).
In addition, from the clinical and biological perspective, eradicating these metabolically active lesions represents a significant therapeutic challenge. A number of groups have now shown that the continuing presence of PET-positive lesions after therapy, whether at day 7 of therapy (1) or before (2) or after autologous transplantation (4), is a poor prognostic feature. The quantitative assessment of therapeutic response using the TLG method therefore represents an opportunity to standardize and improve the early assessment of therapeutic response.
We have demonstrated previously that the expression of a series of genes (e.g., DKK1 and LRP8), as well as genes associated with high-risk disease, is linked to PET, MRI, and X-ray changes (3). In this study, we extend these findings and demonstrate that patients within the PR, MF, and HY subgroups have a higher TLG score, suggesting they constitute a group of hypermetabolic myelomas characterized by a more aggressive glycolytic phenotype with adverse clinical parameters and poor clinical outcome. On the basis of these observations, we hypothesize that focal lesions act as a reservoir of cells which contribute to disease relapse. As the cells reside in a distinct and separate microenvironment, they are exposed to selective pressures different from those of the wider bone marrow, resulting in the development of intraclonal heterogeneity.
In addition, we identify a subset of patients with high TLG scores, classified as GEP70 LR, with adverse outcomes similar to high-risk patients. The GEP70 was of particular interest to us as it is a published, highly discriminatory risk assessment tool. GEP70-defined high-risk patients currently make up approximately 15% of the myeloma patient population, have extremely poor expected outcome, and are being specifically targeted in clinical trials. The TLG score has the potential to supplement and expand the GEP70's definition of a high-risk patient. The GEP70 combined with the TLG score was able to identify an additional group of patients with high TLG scores who are classified as GEP70 LR but have outcomes as poor as those of the GEP70 high-risk group. This group would not be identified by our current genetic analyses alone. Only by combining the two independent prognostic factors of imaging and genetics do we see added benefit. This could be immensely useful in enhancing the ability of the GEP70 to define patients who have poor outcome and are in need of more aggressive treatment. Furthermore, we demonstrated the ability of TLG to neatly discriminate ISS stage II patients into two groups who clinically perform more like patients in ISS stage I and ISS stage III based on low and high TLG, respectively.
In conclusion, while the number of focal lesions, SUVmax, TLG, and MTV are all clinically useful in evaluating tumor burden and glycolytic phenotype in multiple myeloma, TLG and MTV are more highly associated with OS and PFS than the number and SUV of focal lesions. As these volumetric measurements become more readily available through FDA-approved software, these findings strengthen the argument for validating such measures in clinical trials as a basis for modifying therapy in the subset of patients with high TLG at baseline.
Disclosure of Potential Conflicts of Interest
C.J. Heuck is an employee of Janssen R&D. G.J. Morgan is a consultant/advisory board member for Amgen, Celgene, and Takeda-Millenium. F.E. Davies is a consultant/advisory board member for Celgene. No potential conflicts of interest were disclosed by the other authors.
Conception and design: J.E. McDonald, M.M. Kessler, F. van Rhee, B. Barlogie
Development of methodology: J.E. McDonald, M.M. Kessler, A.F. Buros, B. Barlogie
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): J.E. McDonald, M.M. Kessler, M.W. Gardner, J.A. Ntambi, S. Waheed, F. van Rhee, C.J. Heuck, N. Petty, C. Schinke, S. Thanendrarajan, B. Barlogie
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): J.E. McDonald, M.M. Kessler, M.W. Gardner, A.F. Buros, F. van Rhee, A. Mitchell, B. Barlogie, F.E. Davies
Writing, review, and/or revision of the manuscript: J.E. McDonald, M.M. Kessler, A.F. Buros, S. Waheed, F. van Rhee, M. Zangari, C. Schinke, A. Hoering, G.J. Morgan, F.E. Davies
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): J.E. McDonald, M.M. Kessler, N. Petty
Study supervision: J.E. McDonald, M.M. Kessler, F. van Rhee, N. Petty, F.E. Davies
Other (conceived the idea of application of TLG evaluation to the TT3 population and the idea of examining its utility compared to the ISS and is the first author): J.E. McDonald
This work was supported in part by grant PO1 CA 55819 from the National Cancer Institute.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
We thank the patients and staff of the Myeloma Institute.
Note: Supplementary data for this article are available at Clinical Cancer Research Online (http://clincancerres.aacrjournals.org/).
G.J. Morgan and F.E. Davies are co-senior authors of this article.
- Received January 28, 2016.
- Revision received July 28, 2016.
- Accepted August 22, 2016.
- ©2016 American Association for Cancer Research.