Abstract
Purpose: Competing risks observations, in which patients are subject to a number of potential failure events, are a feature of most clinical cancer studies. With competing risks, several modeling approaches are available to evaluate the relationship of covariates to causespecific failures. We discuss the use and interpretation of commonly used competing risks regression models.
Experimental Design: For competing risks analysis, the influence of covariate can be evaluated in relation to causespecific hazard or on the cumulative incidence of the failure types. We present simulation studies to illustrate how covariate effects differ between these approaches. We then show the implications of model choice in an example from a Radiation Therapy Oncology Group (RTOG) clinical trial for prostate cancer.
Results: The simulation studies illustrate that, depending on the relationship of a covariate to both the failure type of principal interest and the competing failure type, different models can result in substantially different effects. For example, a covariate that has no direct influence on the hazard of a primary event can still be significantly associated with the cumulative probability of that event, if the covariate influences the hazard of a competing event. This is a logical consequence of a fundamental difference between the model formulations. The example from RTOG similarly shows differences in the influence of age and tumor grade depending on the endpoint and the model type used.
Conclusions: Competing risks regression modeling requires that one considers the specific question of interest and subsequent choice of the best model to address it. Clin Cancer Res; 18(8); 2301–8. ©2012 AACR.
See commentary by Chappell, p. 2127
Translational Relevance
Competing risks are common in clinical cancer research, as patients are subject to multiple potential failure events, both disease related and otherwise. To evaluate how treatment and patient or disease characteristics relate to these outcomes, competing risks regression models are used. Depending on which model is used, a distinctly different picture of the relationship of covariates to outcomes may be seen. It is important to choose a modeling approach that addresses the question of interest and subsequently interpret the results accordingly. Here, we provide examples and guidance for which model may be most appropriate depending on the specific goals and interests of the analysis.
Introduction
Competing risks methods are common in biomedical research, particularly in cancer, in which the need to deal with multiple potential outcomes is nearly ubiquitous. For example, competing risks are encountered when cancer patients are followed after treatment, and their first failure event may be local recurrence, distant metastases, onset of second primary cancer, or death precluding these events. Statistical analysis and interpretation of competing risks data differ from survival analysis with only a single cause of failure. First, appropriate methods must be applied to obtain a correct estimate of the cumulative probability of each event (1–5). Second, when comparing causespecific outcomes between groups, one must also consider the influence of competing risks and choose the test appropriate for the question of primary interest (6–8). For example, one may want to compare the causespecific hazard of a given event type, which heuristically reflects the rate per unit of time of the event among those not having failed from other events, or the cumulative incidence of the event, which reflects the rate of the event as well as the influence of competing events.
In this article, we address these analyses in the context of regression modeling in the presence of competing risks, focusing on the 2 most commonly used approaches (9, 10). As in the case of estimation and testing (8), the quantities produced by different competing risks models on the same data, generally referred to as “hazard ratios,” can differ substantially and may lead to different or even seemingly contradictory inferential conclusions. In fact, the method of summary and inference for competing risks data has occasionally been a source of debate and contention, primarily because each of the different approaches may only illuminate one important aspect of the data while possibly obscuring others (11–14). Similarly, in the context of modeling, it has been correctly pointed out that effects of prognostic covariates in one model type do not necessarily pertain those that would be obtained in the other type of model (15, 16). We argue that this does not diminish the usefulness of either approach and the choice is rather a matter of the question of interest.
After briefly reviewing competing risks methods in the next section, we present simulation studies of competing risks models under various realistic scenarios. We then present an example from a clinical trial of the Radiation Therapy Oncology Group (RTOG). Finally, we offer guidance for which modeling approach might be most appropriate depending on the question of interest.
Background
Competing risks data, estimators, and tests
Fundamentals of competing risks have been reviewed extensively elsewhere (1–5, 8, 16, 17). Briefly, in competing risks data, an individual can potentially fail from any of several, say K, event types, but only the time to failure for the earliest (in time) of these (or the last followup time if no failure has yet occurred) is observed. Thus, the observations take the form of (T, δ), denoting a failure time T and a failure status indicator δ that records either the type of failure that occurred or indicates that no failure has yet occurred. Although only one event time is recorded, there is partial information on all failure types. For example, if the patient failed from δ = cause 1 at T = 24 months, it is known that the patient has achieved 24 months free of failure from cause 2.
The principal identifiable quantity in competing risks observations is the causespecific hazard function λ_{k}(t), which heuristically represents the probability of failure due to cause k at a moment in time (t), given that no failure of any kind has occurred thus far. The cumulative causespecific hazard Λ_{k}(t) equals the causespecific hazard summed from start of observation to time t. The causespecific hazards for k mutually exclusive events are additive to the hazard of failure from any of the events, so that the quantity Λ(t) = Λ_{1}(t) + Λ_{2}(t) + … + Λ_{k}(t) is the cumulative hazard function for failures from any cause. The cumulative hazard function has corresponding survival function S(t) [probability of remaining eventfree past time (t)] via the identity S(t) = exp(–Λ(t)).
Another important quantity is the cumulative probability of event k over time in the presence of other competing events. The cumulative incidence function F_{k}(t) for cause k is defined mathematically as
F_{k}(t) involves both the probability of having not failed from some other event first up to time t [represented by S(u)] and the causespecific hazard for the event of interest [λ_{k}(u)] at that time. At each time point, the K cumulative incidence functions are additive to the probability of failure from any cause, 1 − S(t). The cumulative probability of event k occurring in the presence of competing risks is often incorrectly estimated by 1 − S_{k}(t), in which S_{k}(t) is calculated by the Kaplan–Meier estimator, treating events other than those due to cause k as censored. Many authors have discussed this issue specifically in the oncology context (1, 3–5).
Tests to determine differences in survival functions between groups are most often formulated as tests comparing the underlying hazard functions using some variant of the logrank test. For causespecific hazards, the same testing approach is valid (9). Similarly, the most common method to compare cumulative incidence functions is carried out via an underlying quantity that is an analogue to the hazard for the cumulative incidence function known as the subdistribution hazard (18). The subdistribution hazard can be thought of as the probability of failure due to cause k at a moment in time, given that no cause k failures have occurred thus far (with those who have failed from other causes considered among those still eventfree with respect to cause k). Several recent articles have compared these tests and discussed their use and interpretation (6–8).
Regression Models for Competing Risks Data
To summarize the effect of treatment and of patient or disease covariates in the competing risks setting, regression models can be used. As in the case of the estimators and tests just described, one must consider the metric on which to assess these covariate effects. Models for competing risks have been reviewed in detail elsewhere (15, 16), and so we briefly revisit these here.
Modeling causespecific hazards
The familiar Cox proportional hazards model is readily applied to modeling causespecific hazards (9). Denote the set of covariates by a vector Z. The model has the form
Here, the causespecific hazard is a function of some unspecified “baseline” causespecific hazard and a set of covariates. Causespecific HR (CHR) estimates from this model are largely interpreted in the same way as HRs in the absence of competing risks. However, when there is dependence between failure types, the effects may reflect the influence of competing events, sometimes in a counterintuitive way (19, 20). Also, as we will illustrate, when analyzing data via a model for the CHR, the covariate effects obtained do not necessarily pertain to the cumulative incidence of a given event type (3, 10, 15).
Modeling cumulative incidence
A similar formulation of the Cox model for hazards can be used for a quantity known as the “subdistribution hazard” (10). Heuristically, the subdistribution hazard λ*_{k} can be thought of as the hazard for an individual who either fails from cause k or does not, and in the latter case has an infinite failure time for cause k. Although this may seem unusual, indeed in the case of mutually exclusive event types, those who fail from one cause are invulnerable to failure from other causes. This mathematical construct allows relating a covariate to the cumulative incidence of a specific event type using the modelwhich is similar to the Cox model in that covariates act to multiply the baseline subdistribution hazard in a timeindependent manner. Interpretation of the subdistribution HR (SHR) is similar to that of the Cox model CHR, but there are additional important considerations that we discuss later.
A number of other approaches to regression modeling for cumulative incidence have been proposed. Because the cumulative incidence function is a function of the causespecific hazards and estimators can be “constructed” from estimates of causespecific failure hazards for the different events involved, one may opt to specify familiar Cox models for one or all of these (21, 22). Parametric distributions for one or all of the causespecific hazards can be used, and a number of such models have been proposed in the biomedical context (23–25). In another approach, Klein and Andersen used generalized linear models, which typically are restricted to data without censoring (26). This is accomplished through creation of a baseline cumulative incidence function estimate based on “pseudovalues,” and subsequent specification of a regression model that relates covariates to cumulative incidence of the given event. Fine has generalized the original subdistribution hazards model to derive a model that has some similarities with this approach (27).
Simulation Study of Competing Risks Regression Models
To understand when the results of the modeling approaches may differ, we simulated competing risks observations and computed the estimators from the models. From a bivariate survival distribution (28), we generate the time to a designated primary event X_{1} and a competing event X_{2}. We also generate an independent censoring time C. Then we obtain T = minimum(X_{1}, X_{2}, C) and an event type indicator (i.e., which time was the minimum) to form the competing risks observations. We compute the Cox CHR (9) and Fine–Gray SHR (10) for both event types. These estimates are averaged over a large number of simulated data sets to illustrate the behavior of the models under various scenarios.
Independent competing risks
For the following simulations (Table 1), correlation between failure times was set at 0, implying that failure hazards (and thus failure times) from the 2 causes are unrelated. In this instance, the CHRs that are estimable from the competing risks observations (time and event type of first failure) provide valid estimates for the “true” hazards for each event type in the bivariate distribution (known as marginal hazards). However, independence of failure times is not likely the case for many biomedical data situations, and because at most 1 of the failure times is observed in actual competing risks data, dependence among times cannot be determined (29, 30). This is the fundamental problem in competing risks that leads to the need for specialized methods.

Scenario I: In the null case in which hazards are identical between groups for both events 1 and 2, both the CHR and SHR models estimate the effects as null (ratio = 1), as expected.

Scenario II: Only event 1 is related to group membership, with the hazard rate in group B equaling twice that of group A. For the CHR model, the effects for group B versus A are as expected (CHR ≈2.0 for event 1 and CHR ≈1.0 for event 2). The SHR model gives a somewhat smaller group effect for event 1 (SHR = 1.79). However, for event 2, the SHR model indicates that group B is less likely to fail (SHR = 0.75). This happens because, even though the event 2 hazards are identical, fewer in group A fail due to event 1 and are therefore available to fail from event 2.

Scenario III: Both events are strongly influenced by group, with both hazards twice as large in group B versus group A. The CHR model estimates show these 2fold HRs. However, the effect is attenuated again in the SHR model (SHR = 1.29) because for each event type a large number of individuals fail from the competing event.

Scenario IV: Group A has substantially lower hazard for event 1 and only moderately lower hazard for event 2, and the CHR model shows these effects. The event 1 SHR is attenuated, whereas the event 2 SHR is null. This is again due to the fact that the model reflects the influence of the competing event. With relatively more individuals available to fail from event 2 in group A (because they fail less from event 1), the cumulative incidence of event 2 (and thus the SHR) is similar between groups A and B.
Dependent competing risks
We next ran simulations with moderately high positive correlation of approximately 0.60 between event times (Table 2). That is, in a given individual, if the event 1 time was small, then the event 2 time would also tend to be small and similarly for large failure times. In oncology, positive correlation between event types is plausible, as for example, patients likely to experience local regional recurrence may also be at higher risk for distant metastases. Some specific results of interest are as follows:

Scenario II: The estimated effects for the CHR and SHR model for event 1 are similar. For event 2, both models show an effect in favor of group B. Even though the actual event 2 hazards from the bivariate distribution generating the data are identical between groups, the introduction of dependence between event times causes the group effect associated with event 1 to become apparent for event 2 as well in both models.

Scenario IV: In this case with a large effect for event 1 and a smaller effect for event 2 in group A, the two models also tend to have more congruent estimates of effects than under independence.
In data analyses, when observing only the time to first event, we do not know the correlation between potential event times. Even when multiple events are observed, treatment usually changes after the first failure occurs. Thus, analyses are typically conducted on time to first event, partitioned into causes, without incorporation of any correlation that might exist among failure times.
Amount of censoring
Because the cumulative incidence functions partition the cumulative probability of failing from any event into probabilities for the specific causes, more failures of one type will naturally result in fewer eventual total failures of the other types, in the case in which all patients are observed until failure. However, as long as a sufficient proportion of patients are censored, this condition need not hold, and indeed one patient group may concurrently have greater failure probability for several competing events relative to another patient group.
Table 3 shows a simulation study under scenario II in which group B has twice the hazard for event 1 only. When there is no censoring, group B has significantly lower cumulative incidence of event 2 (SHR = 0.69) because more patients in this group have failed from event 1. As the censoring proportion increases (for example, if in a study the data were looked at earlier in time), this effect is less apparent. When 67% of patients are currently failure free (i.e., censored), there is no significant excess of event 2 failures in group B. Thus, the proportion censored can influence the estimates obtained for models based on cumulative incidence. Note that the CHR estimates are not influenced by censoring (although under dependence between event times, the CHRs can also be influenced).
In practical terms, this situation can arise in longterm followup after successful treatment. For example, in our previous article, we showed that women who received tamoxifen had greater cumulative incidence of noncancer deaths and second primary cancers (other than uterine cancer, which is significantly increased with tamoxifen use) after 12 years (13.7% vs. 10.1%) than those who received placebo (8). This observation alone does not imply a deleterious effect of tamoxifen with respect to these events, and the CHR for this endpoint indicated an 8% relative increase that was statistically nonsignificant (8).
Example from Clinical Oncology
Analysis by causes of death after treatment for localized prostate cancer
RTOG 8610 was a randomized trial comparing radiation therapy alone to radiation therapy plus androgen deprivation therapy (ADT) after curative surgery for localized prostate cancer (31). Men with prostate cancer may obtain benefit from ADT via reduction in risk of local recurrence and eventual distant metastases, which account for most prostate cancer deaths. On the other hand, depletion of hormones can have negative health consequences with respect to other diseases, thus having the potential to increase othercause mortality. Age at diagnosis is important with respect to both the growth behavior of the prostate tumor and risk for noncancer deaths. A third important covariate is tumor cell differentiation classified into grades, which is strongly related to prostate cancer death but should not have any direct bearing on noncancer deaths. Among 471 trial eligible men who participated in the trial, we examine treatment, age at diagnosis, and grade from the perspective of modeling the CHR and the cumulative incidence [using Fine–Gray and Klein–Andersen (32) models, using the software of Klein] of prostate cancer and othercause deaths.
The addition of hormones reduces the hazard of prostate cancer deaths (Table 4) by 33% (CHR = 0.67). The influence of treatment on the cumulative incidence of prostate cancer death (via the subdistribution hazard) is similar in this case (SHR = 0.66). The Klein–Andersen model estimate is also similar. For age at diagnosis, the CHR model indicates a statistically nonsignificant 11% decrease in failure risk per 10 years of increased age (partly due to some nonlinearity in the age effect). However, the cumulative incidence regression indicates a statistically significant 25% reduction in cancer deaths per 10 years of increased age (SHR = 0.75). This is likely largely due to the fact that with increasing age, risk of death from causes other than cancer increases greatly, even among cancer patients. Grade is strongly associated with prostate cancer death hazard and cumulative incidence, according to both the SHR and Klein–Andersen models.
For othercause deaths (Table 4), ADT is seen to nominally increase the hazard (CHR = 1.13), while for the cumulative incidence model, a 26% greater risk is seen that approaches statistical significance. If indeed ADT reduces prostate cancer deaths, as clearly indicated in for that endpoint, then there will naturally accumulate more othercause deaths among patients receiving it, leading to greater cumulative incidence of this event. This is an important consideration when attempting to infer the extent to which treatment is a causative factor in othercause deaths. Age at diagnosis is similarly related to the CHR of othercause death. For tumor grade, a “protective” effect with for high grade is suggested in the cumulative incidence models. However, this is most likely due to the strong association of this factor with prostate cancer death and not with any direct influence on death from other causes.
An important additional issue we did not address in this article is whether the general model form is appropriate on whichever scale one is working. In particular, the proportionality assumption cannot be simultaneously satisfied for both CHRs and SHRs, although often either model will still provide a reasonably informative summary of covariate effects (33–35).
Discussion
In this article, we discuss two popular alternative methods to exploring the influence of covariates on outcomes when there are competing risks. We showed in simulation studies and data examples that the influence of a covariate on either (i) the risk (hazard) of a specific event or (ii) the cumulative probability (incidence) of that event will differ for the same data. Thus, it is important to be aware of the outcome metric one is evaluating in relation to the covariate and interpret the result appropriately. Both approaches are correct for their respective purposes, and each may lend important information about the covariate.
Covariate effects in the CHR model described here pertain to the event of interest only, without regard to how the covariates act on competing events (9). In fact, individuals failing from other causes simply become no longer at risk for failure and are thus treated essentially as censored observations. The extent to which this model characterizes the covariate's influences solely on risk of the event of interest depends on how causes of failures may be interrelated, a quantity that is not directly observable in competing risks data. However, as the simulations show, even under moderate dependence between failure types, the CHR tends to reflect the influence of the covariate isolated to the event of interest, regardless of how that same covariate is related to competing events. On the other hand, cumulative incidence models measure the effect of the covariate on the specific event cumulative probability, and this covariate effect may derive from the direct effect on making the event of interest more or less likely to occur, or the indirect effect of making competing events more or less likely to occur. Thus, for example, patient age is related to avoidance of dying from prostate cancer through possibly less aggressive prostate disease as well as increased probability of simply dying of other causes first.
Given the choice of modeling approaches, one may ask when to use which? This depends on question(s) of interest and in many cases both may be informative. Because it is desirable to establish that, at least in part, a covariate such as treatment group or a potentially prognostic marker are indeed acting on the event of interest, CHR models might be preferred for treatment or prognostic marker effect testing. Indeed, as the simulations show, when covariate effects are “shared” among competing events, it may be the case that none achieves statistical significance when modeled on the cumulative incidence scale.
On the other hand, the effect of covariates for a given outcome, focusing largely on those who do not fail from other causes first, certainly does not yield the complete picture and indeed can be misleading. For example, the effect of hormone therapy specifically on prostate cancer death is dramatic, but othercause deaths are actually more common, and the effect of treatment on these events is null at best and has the potential to be deleterious. Thus, if one were interested in evaluating the ultimate benefit in terms of prostate cancer deaths avoided in a given population, then the model for cumulative incidence of prostate cancer more realistically models what treatment benefit is to be realized in a population. In our example, because the effect on othercause death is essentially null, results for the 2 models were similar. However, other factors, such as age at diagnosis, have the potential to dramatically alter the “yield” from such a treatment intervention, if say, it was applied in populations too old to benefit. Thus, for building prognostic nomograms or considering health economic effects of treatment, such as those of interest in comparative effectiveness research, modeling on the cumulative incidence scale is of critical importance. These models give a better sense of the influences of treatment and other covariates on an absolute scale.
Competing risks methods have been widely adopted in biomedical research and oncology applications in particular, resulting in richer interpretation of the complex event history that cancer patients undergo. Although methods continue to be developed to better answer the needs of researchers, we hope that this discussion is helpful in applying and interpreting numerous useful tools available at present and frequently seen in the oncology literature.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Grant Support
This work was supported by Public Health Service grants NCI P30CA14599 and RTOG U10 CA21661 from the National Cancer Institute, NIH, Department of Health and Human Services, and by Pennsylvania Department of Health 2009 Formula grant 4100050889.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
 Received August 19, 2011.
 Revision received January 4, 2012.
 Accepted January 17, 2012.
 ©2012 American Association for Cancer Research.