Multi-Machine Learning Binary Classification, Feature Selection and Comparison Technique for Predicting Death Events Related to Heart Disease
|
|
Author:
|
RITU AGGRAWAL, SAURABH PAL
|
Abstract:
|
The term "cardiovascular disease" (CVD) refers to any disease of the heart, blood vessels, or veins. According to the World Health Organization, the annual death toll from CVD continues to exceed 17.9 million. Recent machine learning publications have shown the usefulness of feature selection algorithms in machine learning tasks. This article reports the measured effect of feature selection and argues for its careful consideration in comparable learning tasks. To this end, six models are proposed, each evaluated with eight machine learning classifiers (LR, DT, SVM, LDA, QDA, RF, KNN, and NB). In addition to accuracy, other indicators such as precision, recall, F1 score, support, and AUC/ROC were calculated to support each model. The six models are: (1) a model without dimensionality reduction (all features), (2) a correlation coefficient score model, (3) a voting (hard + soft) classifier model, (4) a linear SVC + SelectFromModel model, (5) a linear SVC + RFECV model, and (6) a tree-based feature classifier (SVC + ET) model. The highest accuracy in each model was 80.61% by linear discriminant analysis in model 1, 83.17% by random forest in model 2, 84.10% by random forest in model 4, 83.12% by linear discriminant analysis in model 5, and 83.05% by logistic regression in model 6. The voting classifier in model 3 reached 76.66% accuracy with both hard and soft voting. Finally, we compared the applicability of all these models for predicting deaths caused by heart disease.
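As a rough illustration of the idea behind a correlation coefficient score model, the sketch below ranks features by the absolute Pearson correlation between each feature column and a binary target, and keeps those at or above a threshold. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names `pearson` and `select_by_correlation`, the feature names, the toy data, and the threshold of 0.5 are all hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def select_by_correlation(features, target, threshold):
    """Keep features whose |r| with the target is >= threshold,
    ordered from strongest to weakest correlation."""
    scores = {name: abs(pearson(col, target)) for name, col in features.items()}
    kept = [name for name, score in scores.items() if score >= threshold]
    return sorted(kept, key=lambda name: scores[name], reverse=True)


# Made-up toy data (hypothetical, not from the study).
features = {
    "feature_a": [1, 2, 3, 4, 5, 6],
    "feature_b": [5, 5, 5, 5, 5, 5],  # constant, so it is dropped
    "feature_c": [2, 1, 4, 3, 6, 5],
}
target = [0, 0, 0, 1, 1, 1]  # binary outcome, e.g. a death event

selected = select_by_correlation(features, target, threshold=0.5)
print(selected)
```

The reduced feature set would then be passed to each classifier in place of the full set, which is what allows a like-for-like accuracy comparison against the all-features baseline.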
|
Keyword:
|
Machine Learning, Cardiovascular Disease, Classifiers, Model Accuracy, Voting Classifier.
|
DOI:
|
https://doi.org/10.31838/ijpr/2021.13.01.080
|