RISK FACTORS OF PREGNANCY LOSS USING MACHINE LEARNING ALGORITHMS
Keywords:
Bureau of Statistics Punjab, K nearest neighbor (KNN), Decision tree classifier, Gaussian NB, Support vector machines (SVM), Bernoulli NB, Passive-Aggressive classifier, Radius Neighbors classifier (RNC), Extra tree classifier (ETC), Linear Discriminant Analysis (LDA)
Abstract
Pregnancy loss, also known as spontaneous abortion, is the loss of a fetus before the 20th week of pregnancy. According to the American College of Obstetricians and Gynecologists (ACOG), around 15% to 20% of clinically diagnosed pregnancies end in pregnancy loss. We used cross-sectional data from the Bureau of Statistics Punjab (BSP) to investigate the risk factors for pregnancy loss. We compared the classification accuracy of ten machine learning algorithms on these data: logistic regression (LR), KNN, LDA, SVM, Gaussian naive Bayes (NB), RNC, decision tree classifier (CART), Bernoulli naive Bayes (BNB), Passive-Aggressive classifier (Passive), and ETC. KNN achieved the best accuracy, 91%. LR, KNN, LDA, SVM, NB, RNC, CART, BNB, and Passive each produced over 80% accuracy. Feature selection and feature importance analysis of 28 variables, using logistic regression, the decision tree classifier, and the extra trees classifier, identified total children ever born and place of delivery as the features that most strongly affect the risk of pregnancy loss.
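The kind of accuracy comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic stand-ins for the BSP survey (two made-up numeric features and a binary outcome), only two of the ten classifiers (KNN and Gaussian naive Bayes) are implemented, and every function name, the 75/25 hold-out split, and the choice k = 5 are assumptions made for illustration.

```python
import math
import random
from collections import Counter


def make_synthetic(n=400, seed=0):
    """Hypothetical stand-in data: two numeric features, binary label (1 = loss)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = 1 if rng.random() < 0.3 else 0
        # Class 1 is shifted so the two classes are only partly separable.
        center = 2.0 if label else 0.0
        X.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
        y.append(label)
    return X, y


def train_test_split(X, y, test_frac=0.25, seed=0):
    """Shuffle the row indices and hold out a fraction of rows for testing."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    cut = int(len(idx) * (1 - test_frac))
    tr, te = idx[:cut], idx[cut:]
    return ([X[i] for i in tr], [y[i] for i in tr],
            [X[i] for i in te], [y[i] for i in te])


def knn_predict(X_train, y_train, x, k=5):
    """Majority vote among the k training points closest to x (Euclidean)."""
    nearest = sorted(zip(X_train, y_train),
                     key=lambda pair: math.dist(pair[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def fit_gaussian_nb(X_train, y_train):
    """Estimate a log prior and per-feature mean and variance for each class."""
    model = {}
    for c in set(y_train):
        rows = [x for x, label in zip(X_train, y_train) if label == c]
        means = [sum(col) / len(rows) for col in zip(*rows)]
        # A small constant keeps the variance strictly positive.
        variances = [sum((v - m) ** 2 for v in col) / len(rows) + 1e-9
                     for col, m in zip(zip(*rows), means)]
        model[c] = (math.log(len(rows) / len(y_train)), means, variances)
    return model


def gaussian_nb_predict(model, x):
    """Return the class with the highest log posterior for x."""
    def log_posterior(c):
        log_prior, means, variances = model[c]
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(x, means, variances))
    return max(model, key=log_posterior)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def compare_models(seed=0):
    """Fit both classifiers on one split and report hold-out accuracy."""
    X, y = make_synthetic(seed=seed)
    X_tr, y_tr, X_te, y_te = train_test_split(X, y, seed=seed)
    nb = fit_gaussian_nb(X_tr, y_tr)
    return {
        "KNN": accuracy(y_te, [knn_predict(X_tr, y_tr, x) for x in X_te]),
        "GaussianNB": accuracy(
            y_te, [gaussian_nb_predict(nb, x) for x in X_te]),
    }


if __name__ == "__main__":
    # Print the models from highest to lowest hold-out accuracy.
    for name, acc in sorted(compare_models().items(), key=lambda kv: -kv[1]):
        print(f"{name}: {acc:.3f}")
```

Scoring every model on the same held-out rows keeps the accuracies comparable across algorithms. The numbers this sketch prints describe the synthetic data only and say nothing about the results reported for the BSP data.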