Abstract
Previously, we have shown that serial measurements of prostate-specific antigen (PSA) in hormone-refractory prostate cancer (HRPC) can be used to calculate an average relative velocity (rva) of PSA. Together, the level of PSA and the rva formed a two-variable model for survival time that worked at any time during the course of HRPC. Here, we have added serial measurements of hemoglobin and weight to test whether they improve the prior model based on PSA alone. Data from two Cancer and Leukemia Group B studies (9181 and 9182) on HRPC were combined to study the relationship between survival and serial measurements of PSA, serum hemoglobin, and patient weight. Altogether, there were 348 patients who could be evaluated. We used the Cox proportional hazard model for survival time with the interval censored method to accommodate time-dependent covariates, and tests for significance were two sided. Log (PSA), rva, log (hemoglobin), and log [weight (in kg)] were all significantly related to survival time during the course of HRPC (P < 3.0 × 10^{−5}). Together, they formed a prognostic score based upon the relative hazard. Higher values of this score implied higher probability of death as the next observed event. Serial measurements of PSA, hemoglobin, and weight provide a prognostic score that can be applied continuously during the course of HRPC. Changes in the score may provide a reproducible measure of treatment effect.
INTRODUCTION
In men with HRPC,^{3} there is a broad spectrum in the severity of disease, with attendant variability in levels of PSA, symptoms, and potential for survival. To determine where a given patient fits in this spectrum, we need a measure or prognostic score. Such a score would also allow patients and physicians to select appropriate therapy (i.e., secondary hormonal treatment versus cytotoxic chemotherapy versus hospice), and it would help clinical investigators control biases in Phase II or III studies. Finally, if the score were determined repeatedly in follow-up, it might provide an early measure of treatment effect. In a prior study, we showed that the level of PSA and the relative velocity of PSA provided a PSA-based prognostic score that related closely to survival at any time during follow-up of HRPC (1). In this report, we extend the previous model to include more patients and additional serial measurements of serum hemoglobin and patient weight. We show how the result provides an improved prognostic score for men with HRPC.
PATIENTS AND METHODS
The prostate cancer committee of the CALGB has completed two studies in men with HRPC. In the first protocol (CALGB 9181), 149 men were randomly assigned to receive either low-dose (160 mg/day) or high-dose (640 mg/day) MA. In the second (CALGB 9182), 239 men were randomly assigned to receive either a low dose of hydrocortisone (40 mg per day) or the combination of the same dose of hydrocortisone plus mitoxantrone at 14 mg/m^{2}, which was given i.v. once every 21 days, as tolerated, for a maximum of 10 doses. Signed informed consent was obtained from all of the patients, and the study was approved by all CALGB participating member institutional review boards. Both protocols required that blood be drawn and assayed for PSA within the 2-week period before entry and at 3–4-week intervals thereafter. Table 1 summarizes and compares critical values for these two groups of men, and it shows that the two groups were similar. The main results for these two studies have been reported in abstract form (2, 3, 4). Preliminary survival analyses indicated that there was no difference between the arms of each study (P > 0.6). Fig. 1 shows that the overall survival for the two groups was nearly identical.
PSA Relative Velocity.
As we have done previously (1), we defined the relative velocity of PSA as:

$$\frac{1}{y}\,\frac{dy}{dt} = \frac{d\,\log y}{dt} \qquad \text{(Eq. A)}$$

with y and t symbolizing PSA and time, respectively. For any given time interval between times t1 and t2, we estimate the relative velocity to be:

$$\frac{\log y_2 - \log y_1}{t_2 - t_1} \qquad \text{(Eq. B)}$$

where y1 denotes the value of PSA at t1 and y2 is the value at t2. Throughout, “log” refers to the natural logarithm, that is, logarithm to the base e, which is ∼2.72. Thus, the relative velocity for any time interval is just the slope of the log (PSA) versus time curve at that interval. The rva was obtained by averaging these values over separate time intervals up to the time t1 of each interval.
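The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' program: the function names are hypothetical, the PSA series is invented for the example, and the rva is assumed to be the unweighted mean of the per-interval slopes of log (PSA).

```python
import math

def relative_velocities(times, psa):
    """Slope of log(PSA) versus time for each pair of consecutive measurements."""
    if len(times) != len(psa) or len(times) < 2:
        raise ValueError("need at least two paired (time, PSA) measurements")
    slopes = []
    for i in range(len(times) - 1):
        t1, t2 = times[i], times[i + 1]
        y1, y2 = psa[i], psa[i + 1]
        if t2 <= t1 or y1 <= 0 or y2 <= 0:
            raise ValueError("times must increase and PSA must be positive")
        slopes.append((math.log(y2) - math.log(y1)) / (t2 - t1))
    return slopes

def average_relative_velocity(times, psa):
    """Assumption: rva is the unweighted mean of the per-interval slopes."""
    slopes = relative_velocities(times, psa)
    return sum(slopes) / len(slopes)

# Hypothetical series (days, ng/ml); these are not patient data from the study.
days = [0, 28, 56]
psa_values = [100.0, 50.0, 25.0]
rva = average_relative_velocity(days, psa_values)
print(rva)  # units are 1/day; negative because PSA is falling
```

Because PSA halves in each 28-day interval of this invented series, both interval slopes equal log(0.5)/28, and so does their average.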
Relative Hazard Exponent as a Prognostic Score.
In traditional analysis of survival time, the hazard function is defined as the conditional probability that a patient’s survival time will fall in a time interval, given that the patient has survived up to the beginning time of the interval (5). The commonly used Cox model assumes that this hazard, here symbolized as h(t), is proportional to a baseline hazard h0(t), which is often designated as the hazard for a patient with mean values of the prognostic variables. The application we see most commonly confines its interest to prognostic variables observed at the beginning of a study, and treatment is added to the list. We will symbolize these prognostic variables as x1, x2, … xn. However, to develop a model that could be applied serially throughout the course of disease, we use prognostic variables that can change with time. In this case the prognostic variables become x1(t), x2(t), … xn(t), and the (t) indicates that they change with time. Thus, with this approach, the hazard function h(t) can be written in equation form as:

$$h(t) = h_0(t)\,\exp\left[\beta_1 x_1(t) + \beta_2 x_2(t) + \cdots + \beta_n x_n(t)\right] \qquad \text{(Eq. C)}$$

where β1, β2, … βn are the coefficients to be estimated.
Here, we will designate the linear combination in the exponent of Eq. C as the “prognostic score.” Because this score is mathematically related to the hazard function, it is also related to the probability of dying. Because this score is a linear combination of several prognostic variables, it is multivariate.
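As a minimal sketch of this definition, the score is a weighted sum of the current covariate values, and the hazard relative to baseline is its exponential. The function names, coefficients, and covariate values below are hypothetical placeholders chosen for easy arithmetic; they are not the fitted values from this study.

```python
import math

def prognostic_score(coefficients, values):
    """Linear combination sum(beta_i * x_i(t)): the exponent of the Cox hazard."""
    if len(coefficients) != len(values):
        raise ValueError("one coefficient is required per covariate")
    return sum(b * x for b, x in zip(coefficients, values))

def relative_hazard(score):
    """Hazard relative to baseline, h(t) / h0(t) = exp(score)."""
    return math.exp(score)

# Hypothetical coefficients and covariate values, for illustration only.
betas = [0.5, -1.0]
x_now = [2.0, 0.5]
score = prognostic_score(betas, x_now)  # 0.5 * 2.0 + (-1.0) * 0.5
print(score, relative_hazard(score))
```

A score of 0 corresponds to the baseline hazard, and each unit increase in the score multiplies the hazard by e.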
Statistical Methods.
We used two statistical models: the Cox proportional hazard model (5, 6, 7, 8, 9) and the general linear model (10). We used the Cox model first to identify which variables were prognostic and to estimate the coefficients of Eq. C. Then we used the general linear model to test how the prognostic score related to treatment. For the Cox model analysis, we used the coxph program of the S-PLUS software (6), and to adapt the analysis to the serial values of PSA, hemoglobin, and weight as time-dependent variables, we used the “interval censored” or “counting process” approach (7, 8, 9). Time intervals for patients were defined by the time points for sequential pairs of clinic visits, and only the prognostic variables available at the beginning of each interval were used to relate to the outcome at the end of the interval. All tests of significance were two sided.
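The counting-process layout can be illustrated with a short Python sketch that turns one patient's visits into (start, stop, event) rows, each carrying the covariates observed at the start of the interval. The function name, field names, and the example patient are hypothetical, and the handling of the final interval (from the last visit to death or censoring) is an assumption rather than a detail stated in the text.

```python
def counting_process_rows(visits, end_time, died):
    """Build one row per interval between consecutive visits.

    visits   : list of (time, covariates_dict), sorted by time
    end_time : time of death or of last follow-up
    died     : True if the patient died at end_time, False if censored
    """
    rows = []
    for i, (start, covariates) in enumerate(visits):
        is_last = i == len(visits) - 1
        # Assumption: the last interval runs from the final visit to end_time.
        stop = end_time if is_last else visits[i + 1][0]
        if stop <= start:
            raise ValueError("times must be strictly increasing")
        # Only the final interval of a patient who died carries the event.
        event = 1 if (is_last and died) else 0
        row = {"start": start, "stop": stop, "event": event}
        row.update(covariates)  # covariate values known at the interval start
        rows.append(row)
    return rows

# Hypothetical patient: three visits (days), death on day 70.
visits = [
    (0, {"log_psa": 4.6, "log_hgb": 2.5}),
    (28, {"log_psa": 4.9, "log_hgb": 2.4}),
    (56, {"log_psa": 5.3, "log_hgb": 2.3}),
]
rows = counting_process_rows(visits, end_time=70, died=True)
for row in rows:
    print(row)
```

Each row is then treated by the Cox model as a separate at-risk interval, which is what lets covariates measured at later visits update the hazard.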
RESULTS
Validation of rva of PSA.
Because the number of patients we used to define and study the rva of PSA in our previous study was limited to just 133, we sought first to validate that result with 233 additional patients entered onto CALGB study 9182. Table 2 shows the results of two Cox proportional hazard analyses, the first on protocol 9181 and the second on protocol 9182. The Ps in Table 2 demonstrate that the rva of PSA was significantly related to survival in both studies, although the estimated value of the coefficient changed with the patients on protocol 9182. It dropped from 24.48 to 15.54. Because the SE from the first study was so large (i.e., 7.32), this change with additional patients is not surprising. Thus, whereas on new data the rva continued to be an important prognosticator, its importance relative to log (PSA) was less, and this required an adjustment in its coefficient.
Importance of Serum Hemoglobin and Patient Weight.
Table 3 shows the results of a Cox proportional hazard analysis for a four-variable survival model that adds serial measurements of hemoglobin and patient weight to the model of Table 2. The combined patients from both protocols were used in this analysis, but because there were times when neither hemoglobin nor weight was recorded, the total number of patients was limited to just 348, with 241 uncensored. Table 3 demonstrates that serial measures of these four variables related significantly to survival time (Ps ranging from 1.2 × 10^{−15} to 2.1 × 10^{−5}). The model using the logarithms of hemoglobin and weight was better than one using hemoglobin and weight without log transforms, and a restricted model on this same group of patients using just log (PSA) and rva of PSA produced a lower likelihood ratio of 65.2 compared to the 154 of the four-variable model. Thus, hemoglobin and patient weight added significant prognostic information to that provided from PSA alone. Because its residuals showed no trend with the time of follow-up, the model seemed to work for any time in the follow-up period.
Using the coefficients in Table 3 (with hgb representing hemoglobin and wgt representing weight), we can write the prognostic score for the four-variable model at time t as:
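The numeric coefficients are those of Table 3 and are not repeated here. As a sketch of the form only, and assuming the score is centered at the study-population means (as the Appendix note about substituting population means suggests), the score can be written with b_1 to b_4 standing for the Table 3 coefficients and overbars for population means. Both notations are introduced here for illustration.

```latex
\mathrm{score}(t) =
    b_1\left[\log \mathrm{PSA}(t) - \overline{\log \mathrm{PSA}}\right]
  + b_2\left[\mathrm{rva}(t) - \overline{\mathrm{rva}}\right]
  + b_3\left[\log \mathrm{hgb}(t) - \overline{\log \mathrm{hgb}}\right]
  + b_4\left[\log \mathrm{wgt}(t) - \overline{\log \mathrm{wgt}}\right]
```

Under this assumed centering, a patient whose four values equal the population means has a score of 0, which agrees with the Appendix statement that the usual median of the score is 0.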
Fig. 2 shows how this score related to the probability of death as the next observed event in our study patients. As the score increased above zero, the probability of death as the next observed event increased. The relationship was, however, a probabilistic one; i.e., there is no certainty of outcome regardless of the score. For example, the confluent points at 0 indicate when no death occurred, and the confluent points at 1 indicate when there was an observed death. Because for most values of the score, there were some who died and some who survived, the score provided the prognosis for the average patient but not for each individual. See “Appendix” for further details on how to calculate the prognostic score for an example of patient data.
Effect of Experimental Treatment on the Prognostic Score.
Because one of the more important uses of a prognostic score in HRPC would be as an early measure of treatment effect, we tested how the treatments of our two studies affected this score. The top portion of Table 4 shows the results for the treatment of MA in protocol 9181, and the bottom portion shows the result for the treatment of mitoxantrone in protocol 9182. Rather than arbitrarily choosing a single time to measure the effect of treatment, we searched for an overall treatment effect over a 200-day period. We chose 200 days because this was the time during treatment and because it satisfied the requirement of an early measure of response; i.e., one before final survival information was available. To control for an effect of time on the prognostic score that was independent of treatment, we introduced time as linear and nonlinear variables. For protocol 9181, for example, the prognostic score was dependent on time as a linear variable (P = 0.023). For protocol 9182, there was an additional parabolic time effect on the prognostic score; i.e., there was first a fall in scores followed by a rise. Thus, for protocol 9182, we had to include time as a linear variable as well as a squared variable (i.e., time^{2}), and these two variables were significant (P = 0.0003 and 0.018, respectively).
Thus, Table 4 shows how the prognostic score was affected by treatment after controlling for the effects of time. In the top portion, we see that dosage of MA was not significantly associated with any change in prognostic score, either overall or with time (P = 0.57 and 0.18, respectively). Although one might consider the P of 0.18 for the dose × time variable borderline, the magnitude of the coefficient (0.002) suggests that higher dose has minimal influence on the prognostic score over time. Fig. 3 shows the estimate of mean values of prognostic scores over time for those on low dose versus high dose of MA for this protocol. Although the higher dose of MA has a lower curve as time increases, the analysis indicated that this trend was not significant (P = 0.18). In the bottom portion of Table 4, we see the analysis of the effect of combined mitoxantrone and hydrocortisone on the prognostic score. The P of 0.0096 for the first mitoxantrone term implies that the prognostic score was significantly higher in those treated with mitoxantrone. On average, those treated with mitoxantrone had scores 0.0135 greater than those not treated with mitoxantrone. The lack of significance for the mitoxantrone × time and mitoxantrone × time^{2} variables implies that this effect was constant over the 200-day period. Because we were unable to demonstrate any significant differences in PSA, hemoglobin, or weight between the two treatment groups at the outset of the study (P > 0.2), the result suggests that the increase in hazard score was most likely due to mitoxantrone. Fig. 4 demonstrates this difference visually by showing the estimated mean values of multivariate prognostic scores over time for the mitoxantrone-treated group (Mitoxantrone + HC) versus the group treated with hydrocortisone alone (HC Alone).
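The structure of the protocol 9182 analysis can be sketched as a linear model. Only the set of terms (time, time^{2}, mitoxantrone, and the two mitoxantrone × time interactions) comes from the text; the symbols a_0 to a_5, M, and ε are illustrative assumptions, with M = 1 for mitoxantrone plus hydrocortisone and M = 0 for hydrocortisone alone.

```latex
\mathrm{score} = a_0 + a_1 t + a_2 t^2 + a_3 M + a_4 (M \times t) + a_5 (M \times t^2) + \varepsilon
```

In this form, a significant a_3 together with nonsignificant a_4 and a_5 describes two curves of the same shape separated by a constant offset, which is the pattern reported for mitoxantrone. The protocol 9181 analysis would have the same structure with the squared-time terms dropped and MA dose in place of M.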
DISCUSSION
This study demonstrates how sequential measurements of PSA, hemoglobin, and weight form a prognostic score that is closely related to survival. Four features distinguish our prognostic score. (a) Mathematically, it is part of the hazard function, which, in turn, ties it closely to survival. (b) It can be calculated and updated at any time in the course of HRPC. Specifically, our score differs from many prognostic scores, because it comes from serial data collected at the beginning of study as well as throughout follow-up. Thus, our score is suitable for repeated use during follow-up, and it is this feature that makes it a candidate for measuring treatment effect. (c) It includes a measure of the dynamics of PSA, namely, the time-averaged relative velocity (rva). (d) It goes beyond PSA with two additional variables: hemoglobin and weight.
In our study population, when the normalized prognostic score was over 2, the probability that the next observed event was death was ∼0.4. Nevertheless, regardless of the level of prognostic score, there were some who died and some who survived, so the score predicts for an average patient. Furthermore, until this score is validated by study of another population, we do not know how accurately it will perform on patients outside our two protocols or how it will perform as a measure of treatment response. The results are, however, sufficient to establish the importance of serial measurements of PSA, hemoglobin, and weight for both prognosis and response, and they suggest that further studies of HRPC include these serial measures. It may also be important to document the prognostic score at the beginning of studies of HRPC, so that pretreatment bias may be reduced.
Unlike commonly used measures of response in HRPC, the prognostic score is not a binary measure but a continuous one, and all four of its variables are used as continuous ones. In this way, the score is like PSA, tumor volume, tumor growth rate, or even survival time, all of which are continuous. Whereas it may be traditional to document treatment effects as binary (present or absent), or as ternary (complete, partial, and absent), we have also expressed effectiveness of treatment in terms of either survival time or disease-free interval. Both of these are continuous. Because PSA is continuous, imposing discrete cutoff points on PSA or on changes in PSA is unnecessary. For example, the commonly used percentage decreases in PSA at any time are discrete cutoff points imposed on the continuous relative velocity of PSA. We can see this below.
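If we symbolize the values of PSA at two times (t1 and t2) as y1 and y2 and the percentage decrease in PSA as “PD,” then the relative velocity of PSA (“rv”) can be written as:

$$\mathrm{rv} = \frac{\log (y_2 / y_1)}{t_2 - t_1} = \frac{\log (1 - \mathrm{PD}/100)}{t_2 - t_1}$$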
Thus, an 80% decline in PSA at 60 days corresponds to a relative velocity of approximately −0.027, and a 50% decline at 60 days corresponds to a relative velocity of −0.012. In Fig. 5, the curved lines illustrate this function, and they show the continuous relationship between relative velocity and percentage decline for two different time points: 4 weeks (28 days) and 60 days. As the percentage decline increases, the relative velocity decreases. The point to this analysis, however, is to suggest that the variable and traditional choices of PSA response in terms of specific percentage declines or landmark times are just points taken from a natural continuum of relative velocity of PSA. Our results imply that, instead of such arbitrary points, we should use a continuous measure of treatment effect, just as we have often relied on survival times and disease-free intervals.
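The two figures quoted above can be checked with a short Python sketch of the conversion between percentage decline and relative velocity. The function names are hypothetical; the formula is rv = log(1 − PD/100) / (t2 − t1), with log the natural logarithm and time in days.

```python
import math

def rv_from_percent_decline(pd_percent, days):
    """Relative velocity (1/day) implied by a percentage decline over `days`."""
    if not 0 <= pd_percent < 100:
        raise ValueError("percentage decline must be in [0, 100)")
    if days <= 0:
        raise ValueError("the time interval must be positive")
    return math.log(1 - pd_percent / 100) / days

def percent_decline_from_rv(rv, days):
    """Inverse relation: percentage decline implied by a relative velocity."""
    return 100 * (1 - math.exp(rv * days))

print(round(rv_from_percent_decline(80, 60), 3))  # -0.027
print(round(rv_from_percent_decline(50, 60), 3))  # -0.012
```

The same 50% decline reached in 28 days rather than 60 gives a more negative relative velocity, which is why a percentage cutoff only has meaning together with its landmark time.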
The practice of defining therapeutic response based on PSA alone has been criticized for several reasons (11, 12, 13, 14, 15, 16). PSA response has not always related to survival (12). Treatment can change PSA without affecting tumor growth (14, 16). PSA response has been variably and inconsistently defined (15). PSA response does not always correlate with response in measurable disease (11, 17), and finally, there can be wide temporal fluctuations in PSA (11). Our results indicate that rva can improve the use of PSA by capturing dynamic aspects while avoiding undue influence from either a particular level of PSA or fluctuations in PSA: rva has units of 1/time (rather than ng/ml), and it is a time average of relative velocities. In fact, in our results rva was more closely associated with survival time than was the level of PSA, and after hemoglobin, rva was the variable most closely associated with survival time. Finally, because our prognostic score uses two non-PSA variables, it can avoid some of the difficulties of a response based solely on the level of PSA.
Hemoglobin is a prognosticator that reflects not only the general health of the patient but also the degree of tumor displacement of the marrow. As a prognosticator for HRPC, hemoglobin is not new. For example, hemoglobin, hematocrit, and the designation of anemia have been found to be prognostic in numerous studies (17, 18, 19, 20, 21, 22, 23, 24, 25, 26), and this prognostic value has persisted in multivariate analysis (17, 19, 20, 25). Whereas the level of hemoglobin has often been categorized as either low or normal, in this study, we found that hemoglobin was best used as a continuous variable. Furthermore, as a repeated measurement during follow-up, the natural logarithm of hemoglobin was more closely associated with survival than was the untransformed value of hemoglobin, and in terms of Ps, it was the variable most closely associated with survival time. Although hemoglobin is correlated with performance status, its measurement is more objective than performance status. Nevertheless, it is possible that serial measurements of performance status could improve on the prognostic score, especially if, as Smith et al. (17) have suggested, there is some interaction between hemoglobin and performance status. Because we did not have serial measurements of performance status, we could not evaluate this interaction.
Like hemoglobin, body weight is a prognosticator that reflects the general health of the patient, and it has been reported to be prognostic before (21, 24). Mostly, what has been emphasized has been degree of weight loss, but in our analysis, we found that as a time-dependent variable the natural logarithm of absolute weight was prognostic, and correcting for the initial on-study weight did not significantly improve this model.
In summary, we have formed a prognostic score based on the level of PSA, the time average of relative velocity of PSA, the level of serum hemoglobin, and body weight. The score is closely related to survival time, so that it may be a surrogate measure of response, but proving this will require further work. For example, we need to demonstrate that the score has a significant association with an effective treatment for HRPC. Specifically, when a univariate Cox model analysis produces a significant treatment effect, we should be able to show that, in a bivariate Cox model, the score can nullify the treatment effect (27). Alternatively, when the treatment arms produce a significant difference in the score, then the arms should also produce a significant difference in survival. In this study, we found a small but significant difference in the score for the two treatment arms of protocol 9182, but the average difference in the score for these two arms was just 0.0135. A quick examination of Fig. 2 shows that such a small difference in score is not likely to translate into much of a survival difference. Thus, it is not surprising that preliminary analyses have shown no significant difference in survival for the two treatment arms of protocol 9182.
Appendix 1
Example of Prognostic Score Calculation
Calculation of average relative velocity of PSA (rva) was performed as follows:
The prognostic score was estimated as follows:
The estimated probability of death as the next observed event was 0.05.
The usual median for the score is 0, and the usual range is from −3 to +3.
In the calculation we have substituted our population’s means for averages of log (PSA), rva, log (hemoglobin), and log (weight).
A computer program is available to perform these calculations. It is designed to operate on an IBM-compatible computer, and it calculates the prognostic score from the current values of PSA, hemoglobin, and weight, together with five prior values of PSA with their respective dates. It also estimates the probability of death as the next observed event, assuming the patient behaves as an average member of our studied population.^{4}
Footnotes

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

1 Supported by NIH Grants CA47577 (to R. T. V.), CA26806 (to N. A. D.), and CA41287 (to N. J. V.).

2 To whom requests for reprints should be addressed, at Laboratory Medicine (113), Veterans Affairs Medical Center, Durham, NC 27705. Phone: (919) 286-0411; Fax: (919) 286-6818.

3 The abbreviations used are: HRPC, hormone-refractory prostate cancer; PSA, prostate-specific antigen; CALGB, Cancer and Leukemia Group B; MA, megestrol acetate; rva, average relative velocity.

4 R. T. V. will provide this program to readers who mail a 3.5-inch disk to the address for reprint requests.
 Accepted January 7, 1999.
 Received October 14, 1998.
 Revision received January 6, 1999.