Published on: October 2020
Indian Journal of Pharmaceutical Education and Research, 2020; 54(3s):s705-s715
Original Article | doi:10.5530/ijper.54.3s.171

Machine Learning-Based Decision Support System for Early Detection of Breast Cancer

Authors and affiliation(s):

Mochen Li1,*, Gaurav Nanda1, Santosh Chhajedss2, Raji Sundararajan1

1School of Engineering Technology, Purdue University, Grant St. West Lafayette, USA.

2MET's Institute of Pharmacy, Bhujbal Knowledge City, Nashik, Maharashtra, INDIA.


Background: Breast cancer is one of the leading causes of death among women in the United States and one of the most malignant cancers among women worldwide. Early, more accurate detection of breast cancer enables extended longevity at a reduced cost. Towards this, analyzing the available big data using tools such as machine learning-based decision support systems can improve the speed and accuracy of early detection of breast cancer. In this paper, we examined the prediction performance of various state-of-the-art machine learning models and of a decision support system based on these models that provided the predicted category along with a prediction confidence measure.

Methods: The machine learning (ML) algorithms applied include Logistic Regression, Decision Tree, Naïve Bayes, k-Nearest Neighbors (kNN) and Support Vector Machine (SVM). We also analyzed the effect of multiple feature selection approaches on the prediction performance. We used the Breast Cancer Wisconsin Dataset from Wisconsin Prognostic Breast Cancer (WPBC) with 569 digitized images of fine needle aspirates (FNA) of breast masses, each described by 10 real-valued features. The performance of the ML models was evaluated using ten-fold cross-validation and also on a prediction set comprising 20% of the data, with the models trained on the remaining 80%. Sensitivity and specificity were used as the primary measures of performance.

Results: Among all five machine learning methods, SVM had the best performance. Except for the kNN algorithm, the performance of the other three algorithms, Logistic Regression, Naïve Bayes and Decision Tree, was also quite close to that of SVM. The prediction performance of the decision support system was better than that of any individual ML model where the prediction confidence was “High” or “Medium”.

Conclusion: We found that feature selection improved the performance and reduced the computation cost for all ML models. By building the ML-based decision support system with the optimal feature subset, the prediction performance for breast cancer can be improved to 96%, which means it can provide powerful assistance to doctors and patients. On the other hand, as the size of the data set increases, processing data with many features can increase the computation cost as well as the possibility of classification errors.
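
Illustrative sketch: The abstract does not specify how the decision support system derives its prediction confidence, so the following Python sketch is only one possible reading, not the authors' implementation. It shows how sensitivity and specificity can be computed from predicted labels, and how the outputs of five classifiers might be combined by majority vote into a predicted category with a confidence level. The label encoding (1 = malignant, 0 = benign), the function names and the vote thresholds for "High", "Medium" and "Low" are all assumptions made for illustration.

```python
from collections import Counter

# Assumed label encoding for this sketch (not taken from the paper).
MALIGNANT, BENIGN = 1, 0


def sensitivity_specificity(y_true, y_pred, positive=MALIGNANT):
    """Return (sensitivity, specificity) for binary labels.

    Sensitivity = TP / (TP + FN): share of malignant cases detected.
    Specificity = TN / (TN + FP): share of benign cases correctly cleared.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


def vote_with_confidence(model_predictions):
    """Combine one predicted label per model into (label, confidence).

    Hypothetical rule: unanimous agreement is "High", a single
    dissenting model is "Medium", anything else is "Low".
    """
    n_models = len(model_predictions)
    label, votes = Counter(model_predictions).most_common(1)[0]
    if votes == n_models:
        confidence = "High"
    elif votes == n_models - 1:
        confidence = "Medium"
    else:
        confidence = "Low"
    return label, confidence


if __name__ == "__main__":
    # Toy labels, invented purely to exercise the functions.
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
    print(sensitivity_specificity(y_true, y_pred))

    # One prediction from each of five hypothetical models for one sample.
    print(vote_with_confidence([1, 1, 1, 1, 0]))
```

Under this rule, a clinician could rely on the "High" and "Medium" predictions and refer "Low" confidence cases for further examination, in line with the finding that the system did better than any individual model at those two confidence levels.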

Key words: Breast cancer, Data analysis, Machine learning, Feature selection, Decision support system.



