Developing a machine learning algorithm to predict prostate cancer risk in men aged 40 and above at risk of prostate cancer
Abstract
Introduction – Prostate cancer is the second most common cancer among men globally and the leading cause of cancer-related deaths in Uganda. Despite the availability of screening tools, the accuracy of prostate cancer detection remains suboptimal. This study aimed to predict prostate cancer risk by applying machine learning algorithms to clinical data from URO Care Hospital.
Methods and Results – Four classifiers were evaluated on the clinical data: Random Forest, Decision Tree, Logistic Regression, and K-Nearest Neighbour. The Random Forest model performed best, achieving 100% accuracy, precision, and recall, and outperformed the Decision Tree (98% accuracy), Logistic Regression (94% accuracy), and K-Nearest Neighbour (90% accuracy) models. The key predictors of prostate cancer were prostate-specific antigen (PSA) levels above 10 ng/mL, patient age, and prostate volume. Symptoms such as frequent urination and weak urine flow were prevalent but did not correlate strongly with malignancy. The analysis indicates that machine learning can effectively predict prostate cancer risk from clinical data.
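The accuracy, precision, and recall figures quoted above follow the standard definitions for a binary classifier. As a minimal sketch of how such figures are derived from predictions, the example below computes them from confusion-matrix counts. The function name `classification_metrics` and the ten patient labels are hypothetical and purely illustrative; they are not the study's code or data.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Return accuracy, precision, and recall for binary labels.

    Labels are assumed to be 1 (cancer) and 0 (no cancer).
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    return {
        "accuracy": (tp + tn) / len(pairs),
        # Guard against division by zero when a class is never predicted or never present.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical labels for ten patients (illustrative only, not study data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In this illustrative case, one missed cancer (false negative) and one false alarm (false positive) lower recall and precision respectively, while accuracy reflects both errors together. Reporting all three, as the study does, gives a fuller picture than accuracy alone.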
Conclusion – This study demonstrates the potential of machine learning techniques, particularly Random Forest, to predict prostate cancer risk accurately. By improving diagnostic accuracy, these models could strengthen prostate cancer screening and complement traditional screening methods in clinical practice in Uganda. The findings support integrating machine learning models into healthcare systems to improve early detection of, and intervention for, prostate cancer.