Clinical Cancer Research Prevention Award Metabolism
HOME HELP FEEDBACK SUBSCRIPTIONS ARCHIVE SEARCH TABLE OF CONTENTS
Cancer Research Clinical Cancer Research
Cancer Epidemiology Biomarkers & Prevention Molecular Cancer Therapeutics
Molecular Cancer Research Cancer Prevention Research
Cancer Prevention Journals Portal Cancer Reviews Online
Annual Meeting Education Book Meeting Abstracts Online

Clinical Cancer Research 13, 4440, August 1, 2007. doi: 10.1158/1078-0432.CCR-06-2958
© 2007 American Association for Cancer Research

This Article
Right arrow Abstract Freely available
Right arrow Full Text (PDF)
Right arrow Alert me when this article is cited
Right arrow Alert me if a correction is posted
Services
Right arrow Similar articles in this journal
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Download to citation manager
Right arrow reprints & permissions
Citing Articles
Right arrow Citing Articles via HighWire
Right arrow Citing Articles via Google Scholar
Google Scholar
Right arrow Articles by Van Holsbeke, C.
Right arrow Articles by Timmerman, D.
Right arrow Search for Related Content
PubMed
Right arrow PubMed Citation
Right arrow Articles by Van Holsbeke, C.
Right arrow Articles by Timmerman, D.

Imaging, Diagnosis, Prognosis

External Validation of Mathematical Models to Distinguish Between Benign and Malignant Adnexal Tumors: A Multicenter Study by the International Ovarian Tumor Analysis Group

Caroline Van Holsbeke1,4, Ben Van Calster3, Lil Valentin7, Antonia C. Testa8, Enrico Ferrazzi9, Ioannis Dimou5, Chuan Lu6, Philippe Moerman2, Sabine Van Huffel3, Ignace Vergote1 and Dirk Timmerman1

Authors' Affiliations: Departments of 1 Obstetrics and Gynecology, and 2 Pathology, University Hospitals KU Leuven, 3 Department of Electrical Engineering, ESAT-SCD, Katholieke Universiteit Leuven, Leuven, Belgium; 4 Department of Obstetrics and Gynecology, Ziekenhuis Oost-Limburg, Genk, Belgium; 5 Department of Electronics and Computer Engineering, Technical University of Crete, Chania, Greece; 6 Department of Computer Science, University of Wales, Aberystwyth, United Kingdom; 7 Department of Obstetrics and Gynecology, Malmö University Hospital, Malmö, Sweden; 8 Istituto di Clinica Ostetrica e Ginecologica, Università Cattolica del Sacro Cuore, Rome, Italy; and 9 DCS Sacco, Università di Milano, Milan, Italy

Requests for reprints: Dirk Timmerman, Department of Obstetrics and Gynecology, University Hospitals, KU Leuven, Herestraat 49, B-3000 Leuven, Belgium. Phone: 32-1634-4201; Fax: 32-1634-4205; E-mail: dirk.timmerman{at}uz.kuleuven.ac.be.


    Abstract
 Top
 Abstract
 Materials and Methods
 Results
 Discussion
 Appendix A
 References
 
Purpose: Several scoring systems have been developed to distinguish between benign and malignant adnexal tumors. However, few of them have been externally validated in new populations. Our aim was to compare their performance on a prospectively collected large multicenter data set.

Experimental Design: In phase I of the International Ovarian Tumor Analysis multicenter study, patients with a persistent adnexal mass were examined with transvaginal ultrasound and color Doppler imaging. More than 50 end point variables were prospectively recorded for analysis. The outcome measure was the histologic classification of excised tissue as malignant or benign. We used the International Ovarian Tumor Analysis data to test the accuracy of previously published scoring systems. Receiver operating characteristic curves were constructed to compare the performance of the models.

Results: Data from 1,066 patients were included; 800 patients (75%) had benign tumors and 266 patients (25%) had malignant tumors. The morphologic scoring system used by Lerner gave an area under the receiver operating characteristic curve (AUC) of 0.68, whereas the multimodal risk of malignancy index used by Jacobs gave an AUC of 0.88. The corresponding values for logistic regression and artificial neural network models varied between 0.76 and 0.91 and between 0.87 and 0.90, respectively. Advanced kernel-based classifiers gave an AUC of up to 0.92.

Conclusion: The performance of the risk of malignancy index was similar to that of most logistic regression and artificial neural network models. The best result was obtained with a relevance vector machine with radial basis function kernel. Because the models were tested on a large multicenter data set, results are likely to be generally applicable.


There is a need for adequate preoperative assessment of an adnexal mass, because the presumed diagnosis determines the management. Functional cysts and some benign cysts may be treated conservatively. In benign cysts, laparoscopic treatment may decrease hospitalization length and surgical morbidity (13). However, if an adnexal mass is likely to be malignant, the patient should be referred to a gynecologic oncologist for surgical staging and optimal debulking. Spilling of an early stage ovarian cancer during laparoscopy or suboptimal debulking with residual disease may worsen the prognosis (47).

In most cases, experienced sonologists could correctly determine the character of an adnexal mass on the basis of their own subjective impression. They may achieve a sensitivity with regard to malignancy of >95% and a specificity of ~90% (8, 9). Less experienced sonologists could be helped by scoring systems or mathematical models that can be used to discriminate between benign and malignant adnexal masses.

A whole series of prediction models (scoring systems, logistic regression models, and artificial neural networks) have previously been developed (10). Most of them have not been tested prospectively on a large population. Therefore, their generalizibility is unknown.

The aim of this study was to prospectively validate the performance of published scores and mathematical models to predict malignancy in an adnexal mass by applying the models to the data collected in a large multicenter study by the International Ovarian Tumor Analysis (IOTA) group (11).


    Materials and Methods
 Top
 Abstract
 Materials and Methods
 Results
 Discussion
 Appendix A
 References
 
Data set
The data set used was recruited during phase I of the IOTA study. This was the first study from the IOTA group, a prospective multicenter study done in nine centers from five different countries (11). The aim of the IOTA study was to collect sonographic and demographic data of >1,000 patients with an adnexal mass to develop mathematical models to predict malignancy in an ovarian mass. All details have been previously described (11). The primary outcome was the histologic classification of the excised tissue as malignant or benign. Data from 1,066 patients were available for analysis. Serum CA125 values were known for 809 (75.9%) of the 1,066 patients. Models that included CA125 were tested only on these 809 patients. Overall, 800 (75%) tumors were benign and 266 (25%) were malignant.

Selection of scores and mathematical models to predict malignancy
The selection of the old models was partly done by literature search. We selected models for which the statistical formula was published, provided that all the variables used in the formulae were also analyzed in phase I of the IOTA study. In this way, 17 models were selected: three scoring systems [two risk of malignancy indices (RMI and RMI2) and Lerner's scoring system; refs. 1214], seven logistic regression models (1521), three artificial neural network models (19, 20), and four least squares support vector machine (LS-SVM) and relevance vector machine (RVM) models (20, 22). LS-SVMs and RVMs are advanced mathematical models that are also highly capable of handling nonlinear data. They are believed to perform well when used prospectively and are briefly described below.

The variables used in the selected scoring systems and models are shown in Table 1 . The following models/scores did not fulfill the inclusion criteria: the scoring systems of DePriest et al. (23), Ferrazzi et al. (24), Alcazar et al. (25), and Sassone et al. (26); the logistic regression model of Alcazar et al. (27); and the neural networks from Biagiotti et al. (28), Clayton et al. (29), Bishop (30), and Tailor et al. (31). The reasons for exclusion are listed in the Supplementary Table that is published online.


View this table:
[in this window]
[in a new window]

 
Table 1. Scores and models tested, variables used in the scores and models, and cutoffs of the respective scores and models suggested in the original report

 
LS-SVMs. SVMs are learning machines that, using a kernel function, nonlinearly transform the input space into a high dimensional feature space. In this high dimensional feature space, a linear classifier is constructed. Depending on the type of transformation (linear or nonlinear kernel), this linear classifier in the feature space coincides with either a linear or nonlinear classifier in the original input space. The SVM looks for a linear classifier in the feature space by maximizing the margin between the data of both classes (leading to simpler models). Therefore, SVMs look for a trade-off between minimizing complexity and minimizing the number of misclassifications. The resulting model is sparse in that only a few cases from the training set were used to construct the decision boundary. These cases are called the "support vectors" and are usually located on or close to the decision boundary. The LS-SVM is a variant of the standard SVM in which training of the model is greatly simplified (20, 32). Lu et al. developed a linear (using a linear kernel) and a nonlinear (using a radial basis function or RBF kernel to transform the input data) LS-SVM model in a Bayesian framework (20, 22).

RVMs. RVMs were inspired by the SVM model formulation, but because they do not work with a feature space, there are less restrictions with respect to valid transformations of the input space (32). Therefore, they are still fundamentally different from SVMs. RVM models use a Bayesian perspective and the resulting model is again sparse (33). However, the cases used for constructing the decision boundary were prototypical examples of both classes rather than cases that were close to the decision boundary. Lu et al. developed a linear and a nonlinear (using an RBF kernel) RVM model (20, 22). The principles of logistic regression analysis, artificial neural networks, SVMs, and RVMs are described in refs. (20, 30, 3237).

Modification of variables
Some variables used in some scores or models lacked an exact corresponding variable in the IOTA database. The IOTA variables that we used in these cases are shown in the Supplementary Tables published online.

Statistical analysis
Statistical analyses were carried out using Statistical Analysis Software version 9.1 (SAS Institute, Inc.). The performance of the models and scores were compared with respect to their area under the receiver operating characteristic (ROC) curve (AUC). After applying the cutoff value to predict malignancy suggested in the original publication, the sensitivity, specificity, positive predictive value, negative predictive value, positive likelihood ratio (LR+), and negative likelihood ratio (LR–) for each score/model could be calculated. We used the cutoffs reported in the original articles, although the optimal cutoff was highly dependent on the population characteristics, e.g., the amount of malignant and exceptional tumors. Because the sensitivity and related measures were dependent on the cutoff value chosen, whereas the AUC was not, we considered the AUC to be the most important measure of diagnostic performance (38, 39). The nonparametric procedure described in ref. (38) was used to test whether the AUCs of two models differed. To compare all 17 scores/models with each other, we used a ranking method based on the work of Pepe et al. (40). All models were ranked with respect to a chosen criterion. This ranking, however, is influenced by sampling variability. This variability is accounted for by computing the probability that a method is ranked among the {kappa} best methods: Pm({kappa}) (ref. 40). One thousand bootstrap samples were drawn using the IOTA phase I data, which have CA125 to allow the comparison of all models. For each bootstrap sample, the criterion was computed for each of the 17 models and the models were ranked within each bootstrap from good to bad. This allowed us to compute how many times each model was ranked among the best 5 and best 10 in order to obtain Pm(5) and Pm(10). As already mentioned, the main criterion used was the AUC. However, based on the work of Pepe et al., we also looked at the partial AUC (pAUC; ref. 40). The pAUC or the partial AUC gives you the area under a part of the ROC curve with a minimum acceptable specificity. We believe that the minimum acceptable specificity level when predicting the malignancy of ovarian tumors is 75%; therefore, we computed the partial AUC (0.75) for that part of the ROC curve in which the specificity is at least 75% (i.e., the most left part).


    Results
 Top
 Abstract
 Materials and Methods
 Results
 Discussion
 Appendix A
 References
 
The diagnostic performance of the scores and models tested is shown in Tables 2 and 3 and in Fig. 1 . The scoring system that did best when applied on the IOTA data was the RMI of Jacobs et al. (12) with an AUC of 0.88. The AUCs of the scores ranged from 0.66 to 0.88 (Tables 2 and 3; Fig. 1A). The AUCs of the logistic regression models ranged from 0.72 to 0.91 with the logistic regression model of Lu et al. (20) performing best (AUC, 0.91; Tables 2 and 3; Fig. 1B). All neural networks did well, their AUCs ranging from 0.88 to 0.89, and so did the LS-SVMs and RVMs with AUCs of 0.91 to 0.92 (Tables 2 and 3; Fig. 1C).


View this table:
[in this window]
[in a new window]

 
Table 2. Diagnostic performance of scores and models tested

 

View this table:
[in this window]
[in a new window]

 
Table 3. Sensitivity, specificity, positive and negative predictive values, and positive and negative likelihood ratios for all models when using the cutoff value suggested in the original report in comparison with the originally reported sensitivity and specificity for the test set patients

 

Figure 1
View larger version (12K):
[in this window]
[in a new window]
[Download PPT slide]
 
Fig. 1. A, AUC of two RMIs and of the Lerner score when tested on the 809 patients with available CA125 results. B, AUC of seven logistic regression models when tested on the 809 patients with available CA125 results. C, AUC of two Bayesian LS-SVM and two RVM when tested on the 809 patients with available CA125 results. D, AUC of the best diagnostic model per category when tested on the 809 patients with available CA125 results. For the scoring systems, the best score is the RMI (12); for the neural networks, this is the Bayesian MLP (multi-layer perceptron) from Lu et al. (20); for the logistic regression models, the model from Lu et al. (20) and for the vector machine models, the RVM with RBF kernel from Lu et al. (20, 22).

 
Figure 1D shows the best performing model per category, the RVM with RBF kernel of Lu et al. (22) having the highest AUC (0.923). Table 4 shows the 17 scores and models ranked from high to low AUC. The five best performing models were the RVMs, the LS-SVMs, and the logistic regression model from Lu et al. (20, 22). By using the ranking method from Pepe et al. (40), the five best performing models were the same (Table 4). The probability of being among the best five models, Pm(5), or the best 10 models, Pm(10), is much higher for these models than for any other model. The other criterion used, the pAUC(0.75), also ranked these five models at the top (Table 4).


View this table:
[in this window]
[in a new window]

 
Table 4. All 17 scoring systems and mathematical models ranked from highest to lowest AUC

 

    Discussion
 Top
 Abstract
 Materials and Methods
 Results
 Discussion
 Appendix A
 References
 
This is the first report on prospective testing of multiple mathematical models to predict malignancy in an adnexal mass from a large and multicenter database (4143). External validation (i.e., prospective testing of a model on a new database in another center) is the ultimate test a model has to go through before it can be used in clinical practice. Internal validation or testing the performance of the model on a test set of data from the same center is less adequate because it does not give information on the generalizibility of the model. The advantage of the IOTA database is that data were collected in different centers with different referral patterns resulting in a diverse study population mimicking a more universal population.

Most models tested on the IOTA database did worse than was originally reported. For an excellent model, the AUC should reach 0.90 (35). Only the logistic regression model of Lu and colleagues (20), the LS-SVM with linear and RBF kernel, and the RVM with linear and RBF did so (20, 22).

There are several plausible reasons why the performance in the original articles was better than the results that we report. One is that some variables in some scores and models lacked an exact corresponding variable in the IOTA database, so that the variables had to be redefined on the basis of the information available in the IOTA database. This was true for both RMIs, the Lerner score, and the Prömpeler logistic regression model. Second, one of the causes of poorer performance in a prospective test are the differences in population characteristics between the data from the original study and the data from the prospective study. If the model development was based on a small sample of tumors, the particularities of that specific sample will have great influence on the model. However, Table 3 shows the population size in the original reports and we could not find a linear correlation between the size of the data set in the original report and the performance of the score or the model. The prevalence of malignancy in a population and the mix of exceptional tumors and borderline tumors will characterize a population. The prevalence of malignancy, for example, is 25% in the IOTA phase I study, but when we look at the original reports of the models that were tested, it varied between 22% and 42%. Models that have been developed on a large and multicenter study sample, representative of a general population, are likely to be more robust. Therefore, a model developed in a tertiary referral center might perform less when it is applied in a regional gynecology center. Because the IOTA data set is a multicenter data set collected in hospitals from different countries and with different referral pattern, it is more likely to represent a "universal" population.

A third explanation for models incorporating ultrasound variables performing worse when tested prospectively, is that the results of some ultrasound variables may be difficult to reproduce, in particular, those including an element of subjectivity, e.g., regular versus irregular cyst wall and those highly dependent on the examination technique adopted, e.g., Doppler variables. Results of spectral Doppler ultrasound examinations depend on the tumor vessel investigated, velocity variables like peak systolic velocity and time-averaged maximum velocity are angle-dependent, and estimation of the color content of the tumor scan to assess tumor vascularity is subjective. This may make scores and models incorporating Doppler variables difficult to reproduce.

In the IOTA database, all variables had been defined using strict criteria, but these criteria may not have corresponded exactly to those used when creating the score or model to be tested. The models that did best in this study, all had been developed on data that had—at least partly—been collected by examiners who also collected data for the IOTA study, and all had been created at institutions that contributed data to the IOTA study. This means that the variables used in these models were probably defined similarly when the models were created and tested, and that the tumor populations in which the models were created and tested were also similar. This may partly explain the apparent superior performance of these models. When using the AUC as a measure of diagnostic performance, the performances of the RMI of Jacobs et al. (12) and the artificial neural network models from Timmerman et al. (19) were surprisingly similar (AUC, 0.883-0.876-0.889), even with the very best mathematical model, the maximum difference in AUC was only 0.04 (0.883 versus 0.923). Also, in other studies, the RMI of Jacobs has been shown to perform very well when tested prospectively (13, 18, 44, 45). The RMI was developed on a relatively small population (143 patients) but its robustness may be explained by the inclusion of variables that did not require high ultrasound skills except for the diagnosis of metastatic lesions (CA125, menopausal status, tumor with any other ultrasound morphology than a unilocular cyst, ascites, bilateral lesions, and presence of metastases at a scan). On the other hand, it requires the knowledge of the serum CA125 level, which is expensive and time-consuming and not always available at the time the scan is done. The RMI is one of the oldest multimodal scoring systems that uses the combination of a tumor marker and ultrasound variables.

When we used the cutoffs that were reported in the original articles, the RBF kernel model picked up another 51 additional malignancies in comparison with the RMI (218 of 242 versus 167 of 242 malignant masses) due to higher sensitivity (Table 4). Furthermore, when using the ranking method of Pepe et al. (40), the RMI was not listed in the top five best performing models.

Finally, the statistical development procedure of the models is of utmost importance. The final model depends on the selection of the different variables. The goodness of fit is important because it implies the accuracy with which the final regression model describes the data (46). Although the AUC represents the quality of a test and is very useful to compare the performance of different models, the selection of cutoff levels is dependent on the study population and the preference of the investigator: if a high sensitivity is chosen, few malignancies will be missed, whereas increasing the cutoff will lead to a higher specificity and thus fewer false-positive results. It is interesting to note that the sensitivities and specificities of the scores/models when using the cutoffs that had been recommended in the original publications were not very good, and they were much worse than in the original publications. It is well known that the optimal cutoff point of any test will depend on the proportion of cancers in the study population; therefore, if one uses the same cutoff in a population with a significantly different amount of malignancies, one can expect that the model will perform worse (Table 3).

We conclude that the AUC of most logistic regression and artificial neural network models was similar. The best results were obtained with vector machine models. The use of these complex mathematical models resulted in the correct classification of a significant amount of additional malignancies. Because the models were tested on a large multicenter data set, results are likely to be generally applicable.


    Appendix A
 Top
 Abstract
 Materials and Methods
 Results
 Discussion
 Appendix A
 References
 
IOTA Steering Committee

Dirk Timmerman, Lil Valentin, Thomas H. Bourne, William P. Collins, Sabine Van Huffel, and Ignace Vergote.

IOTA principal investigators (in alphabetical order)

Jean-Pierre Bernard, Maurepas, France
Thomas H. Bourne, London, United Kingdom
Enrico Ferrazzi, Milan, Italy
Davor Jurkovic, London, United Kingdom
Fabrice Lécuru, Paris, France
Andrea Lissoni, Monza, Italy
Ulrike Metzger, Paris, France
Dario Paladini, Naples, Italy
Antonia Testa, Rome, Italy
Dirk Timmerman, Leuven, Belgium
Lil Valentin, Malmö, Sweden
Caroline Van Holsbeke, Leuven, Belgium
Sabine Van Huffel, Leuven, Belgium
Ignace Vergote, Leuven, Belgium
Gerardo Zanetta, Monza, Italy.


    Footnotes
 
Grant support: This research was supported by interdisciplinary research grants of the Katholieke Universiteit Leuven, Belgium (IDO/99/03 and IDO/02/09 projects), by the Belgian Programme on Interuniversity Poles of Attraction, by the Concerted Action Project AMBioRICS of the Flemish Community, and by the EU Network of excellence BIOPATTERN into the IST program, entitled "Computational Intelligence for Biopattern Analysis in Support of eHealthcare" (contract no. FP6-2002-IST 508803), and by research grants from the Swedish Medical Research Council (K2001-72X-11605-06A, K2002-72X-11605-07B, K2004-73X-11605-09A, and K2006-73X-11605-11-3), by funds administered by the Malmö General Hospital Foundation for the fight against cancer, and a Swedish governmental grant from the region of Scania.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

Note: Supplementary data for this article are available at Clinical Cancer Research Online (http://clincancerres.aacrjournals.org/).

Received 12/15/06; revised 3/ 5/07; accepted 4/ 2/07.


    References
 Top
 Abstract
 Materials and Methods
 Results
 Discussion
 Appendix A
 References
 

  1. Buchweitz O, Matthias S, Muller-Steinhardt M, Malik E. Laparoscopy in patients over 60 years old: a prospective randomized evaluation of laparoscopic versus open adnexectomy. Am J Obstet Gynecol 2005;193:1364–8.[CrossRef][Medline]
  2. Medeiros LR, Fachel JM, Garry R, Stein AT, Furness S. Laparoscopy versus laparotomy for benign ovarian tumours. Cochrane Database Syst Rev 2005;20:CD004751.
  3. Carley ME, Klingele CJ, Gebhart JB, Webb MJ, Wilson TO. Laparoscopy versus laparotomy in the management of benign unilateral adnexall masses. J Am Assoc Gynecol Laparosc 2002;9:321–6.[CrossRef][Medline]
  4. Vergote I, De Brabanter J, Fyles A, Bertelsen K, Einhorn N, Sevelda P. Prognostic importance of degree of differentiation and cyst rupture in stage I invasive epithelial ovarian carcinoma. Lancet 2001;357:176–82.[CrossRef][Medline]
  5. Hacker NF, Berek JS, Lagasse LD. Primary cytoreductive surgery for epithelial ovarian cancer. Obstet Gynecol 1983;61:431–20.
  6. Sutton GP, Stehman FB, Einhorn LH. Ten-year follow-up of patients receiving cisplatin, doxorubicin, and cyclophosphamide chemotherapy for advanced epithelial ovarian carcinoma. J Clin Oncol 1989;7:223–9.[Abstract]
  7. Redman JR, Petroni GR, Saigo PE. Prognostic factors in advanced ovarian carcinoma. J Clin Oncol 1986;4:515–23.[Abstract/Free Full Text]
  8. Timmerman D, Schwärzler P, Collins WP, Claerhout F, Coenen M, Amant F. Subjective assessment of adnexal masses with the use of ultrasonography: an analysis of interobserver variability and experience. Ultrasound Obstet Gynecol 1999;13:11–6.[CrossRef][Medline]
  9. Valentin L. Prospective cross-validation of Doppler ultrasound examination and gray-scale ultrasound imaging for discrimination of benign and malignant pelvic masses. Ultrasound Obstet Gynecol 1999;14:273–83.[CrossRef][Medline]
  10. Timmerman D. The use of mathematical models to evaluate pelvic masses: can they beat an expert operator? Best Pract Res Clin Obstet Gynaecol 2004;18:91–104.[CrossRef][Medline]
  11. Timmerman D, Testa AC, Bourne T, Ferrazzi E, Ameye L, Konstantinovic ML; International Ovarian Tumor Analysis Group. Logistic regression model to distinguish between the benign and malignant adnexal mass before surgery: a multicenter study by the International Ovarian Tumor Analysis Group. J Clin Oncol 2005;23:8794–801.[Abstract/Free Full Text]
  12. Jacobs I, Oram D, Fairbanks J, Turner J, Frost C, Grudzinskas J. A risk of malignancy index incorporating CA 125, ultrasound and menopausal status for the accurate preoperative diagnosis of ovarian cancer. BJOG 1990;97:922–9.[CrossRef]
  13. Tingulstad S, Hagen B, Skjeldestad FE, Onsrud M, Kiserud T, Halvorsen T. Evaluation of a risk of malignancy index based on serum CA 125, ultrasound findings and menopausal status in the preoperative diagnosis of pelvic masses. BJOG 1996;103:826–31.[CrossRef]
  14. Lerner JP, Timor-Tritsch IE, Federman A, Abramovich G. Transvaginal ultrasonographic characterization of ovarian masses with an improved, weighted scoring system. Am J Obstet Gynecol 1994;170:81–5.[Medline]
  15. Minaretzis D, Tsionou C, Tziortziotis D, Michalas S, Aravantinos D. Ovarian tumors: prediction of the probability of malignancy by using patient's age and tumor morphologic features with a logistic model. Gynecol Obstet Invest 1994;38:140–4.[Medline]
  16. Tailor A, Jurkovic D, Bourne T, Collins WP, Campbell S. Sonographic prediction of malignancy in adnexal masses using multivariate logistic regression analysis. Ultrasound Obstet Gynecol 1997;10:41–7.[CrossRef][Medline]
  17. Prömpeler HJ, Madjar H, Sauerbrei W, Lattermann U, Pfleiderer A. Diagnostic formula for the differentiation of adnexal tumors by transvaginal sonography. Obstet Gynecol 1997;89:428–33.[CrossRef][Medline]
  18. Timmerman D, Bourne T, Tailor A, Collins WP, Verrelst H, Vandenberghe K. A comparison of methods for preoperative discrimination between malignant and benign adnexal masses: The development of a new logistic regression model. Am J Obstet Gynecol 1999;181:57–65.[CrossRef][Medline]
  19. Timmerman D, Verrelst H, Bourne TH, De Moor B, Collins WP, Vergote I. Artificial neural network models for the preoperative discrimination between malignant and benign adnexal masses. Ultrasound Obstet Gynecol 1999;13:17–25.[CrossRef][Medline]
  20. Lu C, Suykens JAK, Timmerman D, Vergote I, Van Huffel S. Linear and nonlinear preoperative classification of ovarian tumors. Chapter 11, Knowledge based intelligent system for health care. In: Ichimura T, Yoshida K, editors. Vol. 7, International Series on Advanced Intelligence. Magill, Australia: Advanced Knowledge International; 2004. p. 343–82.
  21. Jokubkiene L, Sladkevicius P, Valentin L. Does three-dimensional power Doppler ultrasound help in discrimination between benign and malignant ovarian masses? Ultrasound Obstet Gynecol 2007;29:215–25.[CrossRef][Medline]
  22. Lu C, Van Gestel T, Suykens JAK, Van Huffel S, Vergote I, Timmerman D. Preoperative prediction of malignancy of ovarium tumor using least squares support vector machines. Artif Intell Med 2003;28:281–306.[CrossRef][Medline]
  23. DePriest PD, Varner E, Powell J, Fried A, Puls L, Higgins R. The efficacy of a sonographic morphology index in identifying ovarian cancer: a multi-institutional investigation. Gynecol Oncol 1994;55:174–8.[CrossRef][Medline]
  24. Ferrazzi E, Zanetta G, Dordoni D, Berlanda N, Mezzopane R, Lissoni G. Transvaginal ultrasonographic characterization of ovarian masses: a comparison of five scoring systems in a multicenter study. Ultrasound Obstet Gynecol 1997;10:192–7.[CrossRef][Medline]
  25. Alcazar JL, Mercé L, Laparte C, Jurado M, Lopez-Garcia G. A new scoring system to differentiate benign from malignant adnexal masses. Am J Obstet Gynecol 2003;188:685–92.[CrossRef][Medline]
  26. Sassone AM, Timor-Tritsch I, Artner A, Westhoff C, Warren W. Transvaginal sonographic characterization of ovarian disease: evaluation of a new scoring system to predict ovarian malignancy. Obstet Gynecol 1991;78:70–6.[Medline]
  27. Alcazar JL, Jurado M. Using a logistic model to predict malignancy of adnexal masses based on menopausal status, ultrasound morphology and color Doppler findings. Gynecol Oncol 1998;69:146–50.[CrossRef][Medline]
  28. Biagiotti R, Desii C, Vanzi E, Gacci G. Predicting ovarian malignancy: application of artificial neural networks to transvaginal and color Doppler flow US. Radiology 1999;210:399–403.[Abstract/Free Full Text]
  29. Clayton RD, Snowden S, Weston MJ, Mogensen O, Eastaugh J, Lane G. Neural networks in the diagnosis of malignant ovarian tumours. BJOG 1999;106:1078–82.[CrossRef]
  30. Bishop CM. Neural networks for pattern recognition. Oxford University Press; 1996.
  31. Tailor A, Jurkovic D, Bourne TH, Collins WP, Campbell S. Sonographic prediction of malignancy in adnexal masses using an artificial neural network. BJOG 1999;106:21–30.[CrossRef]
  32. Suykens J, Van Gestel T, De Brabanter J, De Moor B, Vandewalle J. Least square support vector machines. Singapore: World Scientific.
  33. MacKay D. Probable networks and plausible predictions—a review of practical Bayesian methods for supervised neural networks. Network: Computation in Neural Systems 1995;6:469–505.[CrossRef]
  34. Aslam N, Banerjee S, Carr J, Savvas M, Hooper R, Jurkovic D. Prospective evaluation of logistic regression models for the diagnosis of ovarian cancer. Obstet Gynecol 2000;96:75–80.[CrossRef][Medline]
  35. Hosmer DW, Lemeshow S. Applied logistic regression. Wiley Series in Probability and Statistics. New York; 2000.
  36. Vapnik VN. Statistical learning theory. New York: Wiley; 1998.
  37. Tipping ME. Sparse Bayesian learning and the relevance vector machine. J Mach Learn Res 2001;1:211–44.[CrossRef]
  38. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves. A nonparametric approach. Biometrics 1988;44:837–45.[CrossRef][Medline]
  39. Hanley JA, McNeil B. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 1982;143:29–36.[Abstract/Free Full Text]
  40. Pepe M, Longton G, Anderson G, Schummer M. Selecting differentially expressed genes from microarray experiments. Biometrics 2003;59:133–42.[CrossRef][Medline]
  41. Mol BW, Boll D, De Kanter M, Heintz P, Sijmons E, Oei G. Distinguishing the benign and malignant adnexal mass: an external validation of prognostic models. Gynecol Oncol 2001;80:162–7.[CrossRef][Medline]
  42. Valentin L. Comparison of Lerner score, Doppler ultrasound examination, and their combination for discrimination between benign and adnexal masses. Ultrasound Obstet Gynecol 2000;15:143–7.[CrossRef][Medline]
  43. Valentin l, Hagen B, Tingulstad S, Eik-Nes S. Comparison of "pattern recognition" and logistic regression models for discrimination between benign and malignant pelvic masses: a prospective cross validation. Ultrasound Obstet Gynecol 2001;18:357–65.[CrossRef][Medline]
  44. Davies AP, Jacobs I, Wolas R, et al. The adnexal mass: benign or malignant? Evaluation of a risk of malignancy index. BJOG 1993;100:927–31.[CrossRef]
  45. Morgante G, la Marca A, Ditto A, De Leo V. Comparison of two malignancy risk indices based on serum CA125, ultrasound score and menopausal status in the diagnosis of ovarian masses. BJOG 1999;106:524–7.[CrossRef]
  46. Khan K, Chien P, Dwarakanath L. Logistic regression models in obstetrics and gynecology literature. Obstet Gynecol 1999;93:1014–20.[CrossRef][Medline]



This article has been cited by other articles:


Home page
Clin. Cancer Res.Home page
C. Van Holsbeke, B. Van Calster, A. C. Testa, E. Domali, C. Lu, S. Van Huffel, L. Valentin, and D. Timmerman
Prospective Internal Validation of Mathematical Models to Predict Malignancy in Adnexal Masses: Results from the International Ovarian Tumor Analysis Study
Clin. Cancer Res., January 15, 2009; 15(2): 684 - 691.
[Abstract] [Full Text] [PDF]


Home page
J Ultrasound MedHome page
P. Yoruk, O. Dundar, B. Yildizhan, L. Tutuncu, and T. Pekin
Comparison of the Risk of Malignancy Index and Self-Constructed Logistic Regression Models in Preoperative Evaluation of Adnexal Masses
J. Ultrasound Med., October 1, 2008; 27(10): 1469 - 1477.
[Abstract] [Full Text] [PDF]


This Article
Right arrow Abstract Freely available
Right arrow Full Text (PDF)
Right arrow Alert me when this article is cited
Right arrow Alert me if a correction is posted
Services
Right arrow Similar articles in this journal
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Download to citation manager
Right arrow reprints & permissions
Citing Articles
Right arrow Citing Articles via HighWire
Right arrow Citing Articles via Google Scholar
Google Scholar
Right arrow Articles by Van Holsbeke, C.
Right arrow Articles by Timmerman, D.
Right arrow Search for Related Content
PubMed
Right arrow PubMed Citation
Right arrow Articles by Van Holsbeke, C.
Right arrow Articles by Timmerman, D.


HOME HELP FEEDBACK SUBSCRIPTIONS ARCHIVE SEARCH TABLE OF CONTENTS
Cancer Research Clinical Cancer Research
Cancer Epidemiology Biomarkers & Prevention Molecular Cancer Therapeutics
Molecular Cancer Research Cancer Prevention Research
Cancer Prevention Journals Portal Cancer Reviews Online
Annual Meeting Education Book Meeting Abstracts Online