A Machine Learning Model for Prostate Cancer Prediction in Korean Men
- Authors
- 최석정; 소범기; Oh, Shane; 박홍주; 이상욱; 송기현; 이종민; 조정기; 김선혁; 이시은; 조은비; 정재흥; 김정현
- Issue Date
- Nov-2024
- Publisher
- Korean Urological Oncology Society (대한비뇨기종양학회)
- Keywords
- Diagnosis; Prostatic neoplasms; Machine learning
- Citation
- Journal of Urologic Oncology, v.22, no.3, pp 201 - 210
- Pages
- 10
- Indexed
- KCI
- Journal Title
- Journal of Urologic Oncology
- Volume
- 22
- Number
- 3
- Start Page
- 201
- End Page
- 210
- URI
- https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/199859
- DOI
- 10.22465/juo.244800400020
- ISSN
- 2951-603X; 2982-7043
- Abstract
- Purpose: Unnecessary prostate biopsies for detecting prostate cancer (PCa) should be minimized. Therefore, this study developed a machine learning (ML) model to predict PCa in Korean men and evaluated its usability.
Materials and Methods: We retrospectively analyzed clinical data from 928 patients who underwent prostate biopsies at Kangwon National University Hospital between May 2013 and May 2023. Of these, 377 (40.6%) were diagnosed with PCa, and 551 (59.4%) did not have cancer. For external validation, clinical data from 385 patients aged 48–89 years who underwent prostate biopsies from September 2005 to September 2023 at Wonju Severance Christian Hospital were also included. Twenty-two clinical features were used to develop an ML model to predict PCa. Features were selected based on their contributions to model performance, leading to the inclusion of 15 features. A meta-learner was constructed using logistic regression to predict the probability of PCa, and the classifier was trained and validated on randomly extracted training and test sets at an 8:2 ratio.
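The following is a minimal sketch of the two mechanics named above: a random 8:2 split and a logistic regression meta-learner that combines base-model outputs into one probability. It is not the authors' pipeline. The abstract does not name the base models, so the two "base scores" per patient are synthetic stand-ins, and every function name, the toy data generator, the learning rate, and the number of epochs are assumptions made for illustration.

```python
import math
import random


def split_8_2(rows, seed=0):
    """Shuffle the rows and return (train, test) at an 80% / 20% ratio."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * 0.8)
    return shuffled[:cut], shuffled[cut:]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=500):
    """Fit logistic regression by batch gradient descent; returns (weights, bias)."""
    n, k = len(X), len(X[0])
    w, b = [0.0] * k, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * k, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(k):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(model, x):
    """Probability of the positive class for one row of base-model scores."""
    w, b = model
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def make_toy_rows(n=928, seed=42):
    """Synthetic data only: two hypothetical base-model scores plus a 0/1 label."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = 1 if rng.random() < 0.4 else 0
        center = 0.7 if label else 0.3
        scores = [min(1.0, max(0.0, rng.gauss(center, 0.15))) for _ in range(2)]
        rows.append((scores, label))
    return rows


rows = make_toy_rows()
train, test = split_8_2(rows)
# The meta-learner sees only the base-model scores, not the raw clinical features.
model = fit_logistic([x for x, _ in train], [y for _, y in train])
test_probs = [predict_proba(model, x) for x, _ in test]
test_accuracy = sum(
    (p >= 0.5) == (y == 1) for p, (_, y) in zip(test_probs, test)
) / len(test)
print(len(train), len(test), round(test_accuracy, 3))
```

In a real stacking setup the base-model scores fed to the meta-learner would normally be out-of-fold predictions, so that the meta-learner is not trained on scores the base models produced for their own training rows. The sketch skips that step because its scores are simulated directly.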
Results: The prostate health index, prostate volume, age, nodule on digital rectal examination, and prostate-specific antigen were the top 5 features for predicting PCa. The area under the receiver operating characteristic curve (AUC) of the meta-learner logistic regression model was 0.89, and the accuracy, sensitivity, and specificity were 0.828, 0.711, and 0.909, respectively. Our model also showed excellent prediction performance for high-grade PCa (Gleason score of 7 or higher), with an AUC of 0.903. Furthermore, we evaluated the performance of the model using external cohort clinical data and achieved an AUC of 0.863.
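For readers less familiar with the metrics reported above, the sketch below shows how accuracy, sensitivity, specificity, and AUC are commonly computed from labels and predicted probabilities. The eight labels and scores are made-up toy values, not study data, and the function names and the 0.5 decision threshold are assumptions. The last lines check that the reported accuracy is consistent with the reported sensitivity and specificity, under the assumption (not stated in the abstract) that the test split had roughly the cohort's PCa prevalence of 377/928.

```python
def confusion_counts(y_true, scores, threshold=0.5):
    """Return (tp, fn, fp, tn) for binary labels and predicted probabilities."""
    tp = fn = fp = tn = 0
    for y, s in zip(y_true, scores):
        predicted = 1 if s >= threshold else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted == 1:
            fp += 1
        else:
            tn += 1
    return tp, fn, fp, tn


def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied pairs count as half. The quadratic loop is fine for small samples.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy values for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.6]

tp, fn, fp, tn = confusion_counts(y_true, scores)
accuracy = (tp + tn) / (tp + fn + fp + tn)
sensitivity = tp / (tp + fn)   # share of cancers that were flagged
specificity = tn / (tn + fp)   # share of non-cancers that were cleared
toy_auc = auc(y_true, scores)
print(accuracy, sensitivity, specificity, toy_auc)

# Consistency check on the reported figures:
# accuracy = sensitivity * prevalence + specificity * (1 - prevalence).
prevalence = 377 / 928  # assumed to carry over to the test split
implied_accuracy = 0.711 * prevalence + 0.909 * (1 - prevalence)
print(round(implied_accuracy, 3))
```

The implied accuracy lands close to the reported 0.828, which suggests the three figures are mutually consistent if the test set prevalence was near 40%.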
Conclusions: Our ML model performed well in predicting PCa, particularly clinically significant PCa. Although extensive cross-validation in other clinical cohorts is needed, this ML model is a promising option for future diagnostics.