We thank Drs. Sharma and Ratain for their commentary (1) on our article regarding toxicity attribution in phase I trials (2). The authors raise important points. First, they observe that we only evaluated studies using single agents, and thus our findings may not be generalizable to phase I combination trials. Phase I combination trials frequently evaluate multiple dose escalations, and these independent and often incomparable dose escalations make it difficult, without restrictive assumptions, to express the dose an individual patient received relative to the maximum-administered dose (MAD) achieved in the entire trial. Percentage of MAD was a key measure used in our analysis and can be determined without such assumptions in the single-agent trial setting. Combination trials present unique challenges for attribution, because it can be difficult to ascertain which study drug, or whether the combination regimen, caused or exacerbated an individual toxicity. This granularity is useful for dose reductions; however, the overall tolerability of a combination regimen is defined by all the toxicities the combination causes. When the relatedness of toxicities is considered with respect to the drug combination overall rather than to individual agents, it is unclear whether physicians would truly be more error-prone in attributing toxicities caused by multiple-drug regimens than in attributing those caused by single agents. The methodology presented in our article could be adapted to address the combination question by expressing the MAD separately for each dose escalation.
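As a minimal sketch of that adaptation, the percentage of MAD can be computed within each dose escalation rather than across the whole trial. The record layout, the escalation labels, and the function name `percent_of_mad` are hypothetical and chosen only for illustration; the sketch assumes each patient received a single positive dose within exactly one escalation.

```python
from collections import defaultdict


def percent_of_mad(records):
    """Express each patient's dose as a percentage of the MAD of their own escalation.

    records: iterable of (escalation_id, patient_id, dose) tuples (hypothetical layout).
    Returns a dict mapping patient_id to percent of MAD.
    """
    records = list(records)

    # The MAD is taken separately for each dose escalation.
    mad = defaultdict(float)
    for escalation, _patient, dose in records:
        mad[escalation] = max(mad[escalation], dose)

    # Each dose is scaled by the MAD of the escalation it belongs to.
    return {
        patient: 100.0 * dose / mad[escalation]
        for escalation, patient, dose in records
    }


# Illustrative data only: two independent escalations within one trial.
records = [
    ("escalation_A", "pt1", 10.0),
    ("escalation_A", "pt2", 40.0),
    ("escalation_B", "pt3", 5.0),
    ("escalation_B", "pt4", 20.0),
]
result = percent_of_mad(records)
```

Scaling within each escalation keeps doses from incomparable escalations on a common 0 to 100 scale without assuming that they share a single trial-wide MAD.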
Sharma and Ratain also indicate that at dose levels approaching 0% of the MAD, grade ≥ 1 toxicities, at least possibly drug related, were common (60% of patients). We agree that the y-intercepts (rate of toxicity at ‘zero’ dose level) on Figs. 1A and 2C–E likely reflect a systematic tendency for physicians to overattribute toxicities to drug (Supplementary Fig. S3 of our article demonstrates this). However, the lack of a gold standard prevents us from determining definitively how much misattribution may have contributed to the adverse event (AE) rates at the lowest doses. In a related point, the authors suggest that the higher rate of “possibly” related grade ≥ 1 toxicity at low dose levels may be taken as indirect evidence that these “possibly” related toxicities are more likely to be truly unrelated (and thus misattributed) than grade ≥ 1 “probably” or “definitely” related toxicities. It is important to note that these differences are likely partially explained by the marginal rates of these types of toxicities (“possibly” related, 30%; “probably” related, 12%; and “definitely” related, 4%). Also, differences in the model-predicted rates of “possibly,” “probably,” and “definitely” related toxicities at dose 0 were extrapolated from doses > 0. Most importantly, the corresponding y-intercepts for more clinically meaningful (grade ≥ 3) toxicity, which usually determines the maximum tolerated dose (MTD) in phase I trials, are much lower across all attribution categories and are similar between the “possibly,” “probably,” and “definitely” related categories. Given these factors, we believe caution is necessary when attempting to interpret the clinical significance of differences in y-intercepts across the 5 categories of relatedness.
We also agree that additional useful discrimination may be provided by a 5-tier attribution system; however, there is currently no design in use that can accommodate a probabilistic attribution scale. We previously developed a novel design whereby different toxicities are weighted based on the likelihood of an AE being drug related (3). The design allows physicians to express the degree of confidence in attribution on a continuous scale, which can evolve over time in response to dynamic physician learning (4). Although phase I trial designs currently in use, with a binary classification system of drug relatedness, are reliable provided that the classification error rates are negligible, the probabilistic design is superior to the current designs in the presence of misattribution (3). Because the probabilistic design does not require every AE to be recorded as either a dose-limiting toxicity (DLT) or non-DLT, but rather allows AEs for which attribution is associated with some uncertainty to be given a probability score, the impact of attribution error is diminished (3).
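The contrast between binary and probability-weighted recording can be illustrated with a toy calculation. This is not the design described in (3); the function names, the 0.5 cutoff, and the attribution probabilities below are assumptions made purely for illustration.

```python
def binary_dlt_count(attribution_probs, cutoff=0.5):
    """Count AEs as DLTs only when the attribution probability meets a cutoff.

    The cutoff is a hypothetical stand-in for a binary related/unrelated call.
    """
    return sum(1 for p in attribution_probs if p >= cutoff)


def weighted_dlt_burden(attribution_probs):
    """Sum the attribution probabilities so uncertain AEs contribute fractionally."""
    return sum(attribution_probs)


# Illustrative probabilities that each of four AEs is drug related.
probs = [0.9, 0.4, 0.4, 0.1]

binary = binary_dlt_count(probs)       # each AE is forced to count as 0 or 1
weighted = weighted_dlt_burden(probs)  # each AE counts in proportion to its score
```

In this toy example, the two AEs scored 0.4 are discarded entirely under the binary rule but still contribute to the weighted total, so a single borderline attribution call shifts the weighted burden less than it shifts the binary count.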
A more refined system of attribution, one that allows physicians to assign a probability score capturing the uncertainty in their assessment, would better represent what occurs in clinical practice and would be an improvement over the current attribution system. Nevertheless, we reiterate that the results of our work suggest that, within the framework of the currently used designs and endpoints (DLTs), the standard binary classification system of drug relatedness is adequate.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
- Received December 16, 2015.
- Accepted January 8, 2016.
- ©2016 American Association for Cancer Research.