
Clinical Trials |
Departments of Biomathematics [K. R. H.], Clinical Investigation [M. C. A., M. N. R.], and Gastrointestinal Medical Oncology and Digestive Diseases [R. L., J. L. A.], The University of Texas M. D. Anderson Cancer Center, 1515 Holcombe Boulevard, Houston, Texas 77030.
| ABSTRACT |
|---|
|
|
|---|
| INTRODUCTION |
|---|
|
|
|---|
The majority of UPC patients, however, fall outside of these subsets, and their prognoses are much more difficult to predict. Because of the complex presentations of patients with UPC, clinicians often experience difficulty applying standard statistical methods to assess the interactions between clinical variables, determining the cumulative effect of these variables on survival, and translating this information into appropriate management. Therefore, the goals of this study were to apply a novel statistical method to UPC patients to: (a) identify novel prognostic factors; (b) explore the interactions between clinical variables and their impact on survival; and (c) illustrate explicitly how the covariates interact.
To achieve these goals, the technique of CART analysis was explored. This method uses recursive partitioning to assess the effect of specific variables on survival, thereby ultimately generating groups of patients with similar clinical features and survival times. The partitioning of patients into groups with differing survival times using clinical variables generates a tree-structured model that can be analyzed to assess its clinical utility. A default tree generated from the unmanipulated recursive partitioning algorithm and two alternative trees exploring the effects of alternative partitioning schemes were generated and analyzed.
| PATIENTS AND METHODS |
|---|
|
|
|---|
Of 1609 patients referred with a diagnosis of UPT, 148 were excluded from further analysis based on the criteria outlined above. Thus, from this initial group, 1461 patients were identified with suspected UPTs. From this group of 1461 patients, 81 patients had no evidence of cancer, leaving 1380 patients with suspected UPTs. Of the 1380 patients referred with a suspected diagnosis of UPT, a primary tumor was identified in 380 patients using the diagnostic evaluation previously defined by our group (12), leaving a total of 1000 UPC patients for evaluation.
Twenty-six clinical variables were analyzed within the following general categories: demographic variables, pathological variables, tumor burden, and involvement of specific metastatic sites (Table 1).
The number of involved metastatic organ sites was counted for each patient to provide a crude estimate of tumor burden. In most instances, a positive biopsy was obtained from the most accessible metastatic site, and additional sites of involvement were documented by a physical exam or radiography. For this analysis, a metastatic organ site was counted once, even if there were multiple individual metastases within that site. Recommendations for therapy were based on the availability of active investigational protocols and the current medical literature (1, 9).
Statistical Methods.
Patient survival was measured from the time of diagnosis as established by the date of the initial biopsy, and the survival distribution was estimated using the product limit method of Kaplan and Meier (13). Median survival time was computed as the time when the Kaplan-Meier estimate crossed 50%. Confidence limits for the median were computed as the times when the CIs for the Kaplan-Meier estimate crossed 50%. Multivariate analyses of survival were performed using Cox proportional hazards regression analysis (14), and recursive partitioning was referred to as CART. CART analysis was also used to identify optimal cut points in the data and was implemented using a method suggested by Therneau et al. (15). In this method, the censored survival data are transformed into a single uncensored data value (the so-called "null martingale residual"), which is used as input into a standard regression tree algorithm (16). This ad hoc method has been shown to perform reasonably well for censored time-to-event data (17). The size of the reported trees was determined based on the results of repeated 10-fold cross-validation (16). In addition to the default tree generated by the CART algorithm, we examined alternative initial splits using systematic inspection (18). Simulations were also computed to assess the frequency of alternative splits (19). A restriction was imposed on the tree construction such that terminal subgroups resulting from any given split must have at least 20 patients. Hazard ratios and corresponding CIs and Ps were computed using the Cox model (14). Analyses were performed using S-PLUS software (Version 3.3, Statistical Sciences, Seattle, WA).
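As an illustration of two steps described above, the following Python sketch computes the Kaplan-Meier median and the null martingale residual (the event indicator minus the Nelson-Aalen cumulative hazard at each patient's follow-up time, with no covariates in the model). It is a minimal sketch, not the S-PLUS code used in the study; the function names and the toy data are assumptions made for illustration, and the median is taken as the first event time at which the estimate is at or below 50%.

```python
import numpy as np

def kaplan_meier(time, event):
    """Distinct event times and the product-limit survival estimate at each."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    event_times = np.unique(time[event])
    surv, s = [], 1.0
    for t in event_times:
        at_risk = np.sum(time >= t)            # still under observation at t
        deaths = np.sum((time == t) & event)   # events occurring exactly at t
        s *= 1.0 - deaths / at_risk
        surv.append(s)
    return event_times, np.array(surv)

def km_median(time, event):
    """First event time at which the estimate is <= 50% (None if never reached)."""
    t, s = kaplan_meier(time, event)
    below = np.nonzero(s <= 0.5)[0]
    return float(t[below[0]]) if below.size else None

def null_martingale_residuals(time, event):
    """Event indicator minus the Nelson-Aalen cumulative hazard at each time."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    event_times = np.unique(time[event])
    increments = np.array(
        [np.sum((time == t) & event) / np.sum(time >= t) for t in event_times]
    )
    cumhaz = np.array([increments[event_times <= t].sum() for t in time])
    return event.astype(float) - cumhaz

# Toy data (hypothetical): follow-up in months, 1 = death, 0 = censored.
months = [2, 3, 3, 5, 8, 9]
died = [1, 1, 0, 1, 0, 1]
median = km_median(months, died)
residuals = null_martingale_residuals(months, died)
```

Each residual is a single uncensored number per patient, at most 1, and the residuals sum to zero over the sample, which is what allows them to be passed to an ordinary regression tree.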
| RESULTS |
|---|
|
|
|---|
CART Analysis.
The overall survival curve for all 1000 consecutive UPC patients is displayed in Fig. 1. The median survival was 11 months (95% CI, 10–12 months), with only 11% (95% CI, 9–14%) surviving at 5 years. CART was performed using 26 clinical variables as described in the "Patients and Methods" section. Each tree's structure depended on the initial split of the patients. A default tree was generated by allowing the CART program to determine the variable with the optimal first split, and two alternative trees were explored through a systematic inspection of alternative splits (burling) and bootstrapping. The results for trees generated on 500 bootstrap samples indicated that liver involvement was chosen as the initial split with a probability of 41%, histology was selected with a 27% probability, and lymph node involvement was selected with a 23% probability. The next highest probability was 3%.
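The bootstrap selection frequencies reported above can be approximated with a procedure like the following Python sketch: resample patients with replacement, find the best first split in each resample, and tally how often each variable wins. This is an illustration under stated assumptions, not the authors' code. The data are synthetic, the variable names are placeholders, the response stands in for the transformed survival value, and the split criterion is the reduction in the sum of squares used by a standard regression tree.

```python
import numpy as np
from collections import Counter

def best_first_split(X, y, names, min_leaf=20):
    """Binary covariate whose split most reduces the sum of squares of y."""
    total = np.sum((y - y.mean()) ** 2)
    best_name, best_gain = None, -np.inf
    for j, name in enumerate(names):
        left, right = y[X[:, j] == 0], y[X[:, j] == 1]
        if min(len(left), len(right)) < min_leaf:
            continue  # each resulting subgroup must hold at least min_leaf patients
        within = np.sum((left - left.mean()) ** 2) + np.sum((right - right.mean()) ** 2)
        if total - within > best_gain:
            best_name, best_gain = name, total - within
    return best_name

def first_split_frequencies(X, y, names, n_boot=500, seed=0):
    """Fraction of bootstrap resamples in which each variable is the first split."""
    rng = np.random.default_rng(seed)
    n = len(y)
    counts = Counter()
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # sample patients with replacement
        counts[best_first_split(X[idx], y[idx], names)] += 1
    return {name: counts[name] / n_boot for name in names}

# Synthetic example (not the study data): one strong and one weak covariate.
rng = np.random.default_rng(1)
names = ["liver", "histology", "lymph_node"]
X = rng.integers(0, 2, size=(300, 3))
y = 1.0 * X[:, 0] + 0.2 * X[:, 1] + rng.normal(0.0, 0.5, size=300)
freq = first_split_frequencies(X, y, names, n_boot=200)
```

When several variables have effects of similar size, as with liver involvement, histology, and lymph node involvement in this study, the frequencies are spread across them, which is the motivation for examining alternative trees.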
A second alternative tree was created with the initial split on lymph node involvement (Fig. 4). The structure of this tree was quite distinct from either the default tree or the first alternative tree. For this tree, the best survival (Fig. 5) was in a subgroup of 99 (9.9%) UPC patients with lymph node involvement, one or two total organ sites involved, and nonadenocarcinoma histology (median survival, 45 months; 95% CI not reached). The subgroups with the shortest survival included 117 (11.7%) UPC patients (group 9) with nonneuroendocrine liver metastases but without lymph node involvement (median survival, 5 months; 95% CI, 4–7 months) and 39 patients with adrenal metastases, >2 involved organ sites, and lymph node metastases (median survival, 5 months; 95% CI, 4–8 months).
Similar interactions were observed in the second alternative tree initially split on lymph node involvement. Among the 582 patients without lymph node involvement, the hazard ratio for metastatic sites >2 was 1.3 (95% CI, 1.0–1.7), with P = 0.031. Among the 418 patients with involvement of lymph node sites, the hazard ratio for metastatic sites >2 was 2.1 (95% CI, 1.7–2.7), with P < 0.0001. Thus, the effect of the number of sites was more pronounced in patients with lymph node involvement.
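One way to read these subgroup-specific hazard ratios is as an interaction term in a single Cox model. The formulation below is an illustrative assumption rather than a model reported in the study; N denotes lymph node involvement (0 or 1) and S denotes more than two metastatic sites (0 or 1). Because the reported estimates come from separate subgroup analyses, the correspondence is approximate.

```latex
h(t \mid N, S) = h_0(t)\,\exp\!\left(\beta_N N + \beta_S S + \beta_{NS}\, N S\right)

\mathrm{HR}_{S \mid N=0} = e^{\beta_S} \approx 1.3,
\qquad
\mathrm{HR}_{S \mid N=1} = e^{\beta_S + \beta_{NS}} \approx 2.1

e^{\beta_{NS}} = \frac{\mathrm{HR}_{S \mid N=1}}{\mathrm{HR}_{S \mid N=0}} \approx \frac{2.1}{1.3} \approx 1.6
```

Under this reading, the hazard ratio associated with more than two metastatic sites is roughly 1.6 times larger in patients with lymph node involvement than in those without.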
| DISCUSSION |
|---|
|
|
|---|
The challenges presented by the UPC population highlight two related but distinct goals in prognostic factor studies: (a) to identify the covariate structure (e.g., find independent prognostic factors); and (b) to identify prognostic subgroups. CART marries these objectives nicely by constructing subgroups directly on the covariates. In our previous work identifying prognostic factors that influence UPC survival, we relied on Cox univariate and multivariate analyses (2). Although we clearly identified important prognostic factors, we experienced problems with the bedside utility of this type of data. The principal difficulty was that because UPC patients presented with variable patterns of good and bad prognostic factors, it was difficult to use the Cox-based data to estimate survival for an individual patient. This made it important and challenging to integrate the available prognostic information into patient management.
The analyses conducted in this study demonstrated that the variables reported to be important using the Cox univariate and multivariate technique were consistently applied by the CART program to segregate patients into groups with similar clinical features and survival. For example, we previously reported that clinical variables, such as hepatic involvement, number of metastatic organ sites, lymph node involvement, and tumor histology were statistically significant independent prognostic factors (2). In each of the three trees generated, these variables were used by the program to generate the best splits of patients into groups with differing survival times. In other instances, clinical variables previously reported through univariate analysis to be statistically correlated with survival (such as bone metastases or age) were similarly used by the CART algorithm to generate groups of patients with differing survival times. The fact that each of these approaches used similar clinical variables to stratify patient survival confirms their clinical importance and supports the validity of the CART analysis.
Interestingly, each CART analysis identified a subset of patients with adrenal metastases that experienced very poor survival. In addition, the default tree (initial split on liver involvement) and alternative tree 1 (initial split on pathology) identified a subset of patients with pleural metastases with a median survival of 9 months. These subsets of UPC patients have not been previously described, suggesting that CART was also able to identify novel patient subsets that may require special treatment strategies.
Although this analysis did not specifically seek to compare CART to other prognostic factor methodologies, an advantage for CART is that it can identify prognostic subgroups that are clinically useful because they are based on simple combinations of clinical characteristics. In contrast to traditional regression methods (e.g., Cox proportional hazards regression), which compute a prognostic index as a weighted average of the patient's characteristics (i.e., an algebraic formula), CART constructs groups based on logical combinations of patient characteristics. Thus, the prognostic subgroups are based directly rather than indirectly on the patient characteristics. Another advantage is the simple, intuitive nature of the CART algorithm (i.e., find the best split by examining all possible splits in all available variables, form subgroups based on this split, repeat in each subgroup). Understanding the essential elements of this process does not require great statistical sophistication, yet the trees often capture much of the relevant covariate structure of the data, including complex interactions and nonlinearities that traditional methods can only handle with much effort. Because it recursively looks for covariate structure within patient subgroups, local covariate effects (i.e., when a covariate has a certain prognostic relationship in one patient subgroup but other relationships in other subgroups) can be easily identified. For example, among the 331 patients with liver metastases (right branch on first split on default tree), a split on pathology was performed with neuroendocrine patients split off from the others. However, for the last split on the left (227 patients without liver, bone, adrenal, or pleural metastases and only 1 or 2 total metastatic sites), pathology was used for a split in which the adenocarcinoma patients were split from the other patients. Thus, the pathology variable was used in different ways in different parts of the tree.
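The loop described above (examine every candidate split in every variable, split on the best one, repeat within each subgroup) can be written in a few lines. The Python sketch below is an illustration on synthetic data, not the S-PLUS implementation used in the study. The variable names, the `min_gain` threshold, and the `max_depth` limit are assumptions that stand in for the cross-validated tree size; the minimum subgroup size of 20 follows the restriction stated in the methods.

```python
import numpy as np

def sse(y):
    """Sum of squared deviations from the mean (0 for an empty array)."""
    return float(np.sum((y - y.mean()) ** 2)) if len(y) else 0.0

def grow(X, y, names, min_leaf=20, min_gain=5.0, max_depth=3, depth=0):
    """Recursively partition (X, y); returns a nested dict describing the tree."""
    node = {"n": int(len(y)), "mean": float(y.mean())}
    if depth >= max_depth:
        return node
    best = None
    for j in range(X.shape[1]):                 # every available variable
        for cut in np.unique(X[:, j])[:-1]:     # every possible cut point
            mask = X[:, j] <= cut
            if mask.sum() < min_leaf or (~mask).sum() < min_leaf:
                continue                        # subgroup would be too small
            gain = sse(y) - sse(y[mask]) - sse(y[~mask])
            if best is None or gain > best[0]:
                best = (gain, j, cut, mask)
    if best is None or best[0] < min_gain:
        return node                             # no worthwhile split: leaf
    _, j, cut, mask = best
    node.update(var=names[j], cut=float(cut))
    node["left"] = grow(X[mask], y[mask], names, min_leaf, min_gain, max_depth, depth + 1)
    node["right"] = grow(X[~mask], y[~mask], names, min_leaf, min_gain, max_depth, depth + 1)
    return node

# Synthetic data (not the study data): "histology" matters only when "liver" == 1.
rng = np.random.default_rng(0)
names = ["liver", "histology"]
X = rng.integers(0, 2, size=(400, 2))
y = 2.0 * X[:, 0] + 1.5 * X[:, 0] * X[:, 1] + rng.normal(0.0, 0.3, size=400)
tree = grow(X, y, names)
```

In this synthetic example the second variable is used for a split only inside the branch where the first variable equals 1, which mirrors the local covariate effect described above: the same variable can be informative in one part of the tree and unused in another.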
Two negative aspects of CART deserve mention: (a) because its algorithm performs hundreds of statistical comparisons during the construction of the tree, P values that may be computed comparing identified subgroups are difficult to interpret (i.e., the overall type I error rate is corrupted by all of the preliminary comparisons). Thus, before accepting this model, validation must be performed on an independent data set. (b) CART may not capture modest, global linear effects because it must approximate the linear effect with a series of splits (i.e., a step function), and quite likely, the individual splits would not be statistically significant.
Finally, CART is a simple method for dissecting complex clinical issues such as those presented by UPC patients. By using CART in our practice with UPC patients, we are able to rapidly develop an estimate of the survival probability of an individual patient based simply on clinical features that are readily apparent on completion of the work-up. Future clinical trials of patients with UPC should prospectively examine the ability of the prognostic information obtained from CART to facilitate precise clinical decision-making. Further, these data can be used to identify relatively homogeneous UPC patient populations with similar survival times for analysis of novel therapeutic interventions. This technique may also be readily applicable to other complex clinical data sets.
| FOOTNOTES |
|---|
1 Supported in part by a grant from the University Cancer Foundation, The University of Texas M. D. Anderson Cancer Center. Presented in part at the American Society of Clinical Oncology annual meeting in Philadelphia, PA, May 1996.
2 To whom requests for reprints should be addressed, at Department of Gastrointestinal Medical Oncology and Digestive Diseases, The University of Texas M. D. Anderson Cancer Center, 1515 Holcombe Boulevard, Box 78, Houston, Texas 77030. Phone: (713) 792-2828; Fax: (713) 745-1163; E-mail: jabbruzz{at}notes.mdacc.tmc.edu
3 The abbreviations used are: UPC, unknown primary carcinoma; CART, classification and regression tree; UPT, unknown primary tumor; CI, confidence interval.
Received 5/3/99; revised 8/18/99; accepted 8/19/99.
| REFERENCES |
|---|
|
|
|---|