EVALUATING MACHINE LEARNING AND DEEP LEARNING MODELS FOR EARLY BREAST CANCER DIAGNOSIS
Keywords:
Breast cancer detection, deep learning, machine learning, neural networks, diagnostic accuracy, predictive modeling
Abstract
Breast cancer is the most frequently diagnosed cancer in many parts of the world, and its mortality rate underscores the importance of early and accurate diagnosis. This study presents a comprehensive comparison between traditional machine learning techniques, namely Support Vector Classifier (SVC), Decision Tree (DT), and Random Forest (RF), and a deep learning-based neural network (NN) for breast cancer prediction. Using a large, heterogeneous clinical dataset of 5,200 patients, we included 24 demographic, genetic, and lifestyle factors such as age, BRCA-1 mutation status, mammograms, BMI, and smoking habits. Data preprocessing was rigorous: missing values were imputed with k-NN, features were rescaled with Min-Max normalization, and SMOTE oversampling was applied to address class imbalance. Our results demonstrate the superior performance of the NN model, which achieved 93.0% accuracy, 0.98 precision, and a 0.92 F1-score, outperforming SVC (88.36%), DT (86.18%), and RF (86.90%). Notably, the NN model reduced the no-remitting rate by 22 percent relative to RF, indicating its potential for early-stage diagnosis. Feature importance analysis identified genetic mutations and BMI as important predictors, which is consistent with clinical knowledge. This research not only confirms the effectiveness of deep learning in diagnosing breast cancer but also provides a reproducible framework for bringing it into mainstream clinical practice, and it can inform practical recommendations to shorten diagnostic delays and improve patient outcomes. Future directions include multimodal data fusion and federated learning to improve diagnostic accuracy and enable privacy-preserving collaboration.
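The three preprocessing steps named in the abstract (k-NN imputation, Min-Max normalization, SMOTE oversampling) can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation: the function names, the toy records, and the choices of k are hypothetical, and in practice one would more likely use scikit-learn's `KNNImputer` and `MinMaxScaler` together with imbalanced-learn's `SMOTE`.

```python
import math
import random


def knn_impute(rows, k=3):
    """Replace None entries with the mean of that feature over the k nearest donor rows."""
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for m, other in enumerate(rows):
                # A donor must be a different row with feature j observed.
                if m == i or other[j] is None:
                    continue
                shared = [(a - b) ** 2 for a, b in zip(row, other)
                          if a is not None and b is not None]
                if shared:
                    # Root mean squared difference over the features both rows have.
                    candidates.append((math.sqrt(sum(shared) / len(shared)), other[j]))
            nearest = sorted(candidates)[:k]
            if not nearest:
                raise ValueError(f"no donor rows for feature {j}")
            filled[i][j] = sum(v for _, v in nearest) / len(nearest)
    return filled


def min_max_scale(rows):
    """Rescale every feature column to the range [0, 1]."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic samples by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        target = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to target
        synthetic.append([b + gap * (t - b) for b, t in zip(base, target)])
    return synthetic


# Hypothetical toy records: [age, BMI, BRCA-1 mutation flag]; None marks a missing value.
features = [
    [45, 22.0, 0], [52, None, 0], [38, 24.5, 0], [60, 27.0, 0],
    [47, 23.5, 0], [58, 31.0, 1], [63, 33.5, 1], [55, 29.0, 1],
]
labels = [0, 0, 0, 0, 0, 1, 1, 1]  # 1 = positive (minority) class

imputed = knn_impute(features, k=3)
scaled = min_max_scale(imputed)

minority = [x for x, y in zip(scaled, labels) if y == 1]
n_needed = labels.count(0) - labels.count(1)
synthetic = smote(minority, n_new=n_needed, k=2)

balanced_X = scaled + synthetic
balanced_y = labels + [1] * len(synthetic)
```

Two caveats apply to any such pipeline. Oversampling is normally applied to the training split only, so that synthetic samples do not leak into the evaluation data. Plain SMOTE also interpolates every feature, so binary or categorical fields such as a mutation flag would need a variant designed for mixed data types.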