Prediction of Heart Disease Using Feature Selection and Random Forest Ensemble Method
|
|
Author:
|
DHYAN YADAV, SAURABH PAL
|
Abstract:
|
The heart is a soft and sensitive organ through which the brain regulates the body's blood circulation. Heart disease affects the body in many forms, including pulmonary artery disease, atalata, angina and birth defects. It is mainly related to narrowed or blocked blood vessels in the heart, and its symptoms depend on the type of disease. Heart disease occurs not only in adults but also in children. Infection of the tissues closest to the heart is known as pericarditis, and infection of the heart muscle is known as myocarditis. Machine learning algorithms make the study of medical datasets more accessible: they provide techniques for identifying dataset attributes and the relationships between them.
In this research work, we used heart disease data from the UCI repository. The dataset contains 1025 instances with 14 attributes, with a target variable that labels patients as sick or non-sick. We analyzed classification accuracy, precision and sensitivity for four tree-based classification algorithms: M5P, Random Tree, Reduced Error Pruning and the Random Forest ensemble method. All prediction algorithms were applied after feature selection on the heart patient dataset, using three feature selection methods: Pearson Correlation, Recursive Feature Elimination and Lasso Regularization. The analysis was carried out in three experimental setups. The first experiment applied Pearson Correlation before M5P, Random Tree, Reduced Error Pruning and the Random Forest ensemble method. The second experiment used Recursive Feature Elimination with the same four tree-based algorithms. The third experiment used Lasso Regularization with the same four algorithms. For each experiment we calculated classification accuracy, precision and sensitivity.
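As an illustration of the first experimental setup, the following minimal sketch shows how a Pearson correlation filter can rank attributes against the target before any classifier is trained. It is not the authors' implementation: the toy values, the attribute names (`age`, `chol`, `thalach`), the helper names and the 0.5 threshold are all assumptions made for this example.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant column carries no linear signal; treat it as uncorrelated.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_features(columns, target, threshold):
    """Keep attributes whose absolute correlation with the target meets the threshold."""
    scores = {name: pearson(values, target) for name, values in columns.items()}
    return [name for name, r in scores.items() if abs(r) >= threshold]


# Hypothetical toy data: three attributes for six patients (1 = sick, 0 = non-sick).
toy_columns = {
    "age": [63, 45, 58, 39, 70, 51],
    "chol": [230, 250, 210, 215, 245, 228],
    "thalach": [110, 170, 120, 175, 105, 160],
}
toy_target = [1, 0, 1, 0, 1, 0]

# The 0.5 cut-off is an assumed value, not one taken from the paper.
selected = select_features(toy_columns, toy_target, threshold=0.5)
print(selected)
```

In this toy data `age` and `thalach` separate the two classes almost perfectly while `chol` does not, so only the first two pass the filter. The retained attributes would then be passed to the tree-based classifiers.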
From the results, we conclude that the Pearson Correlation and Lasso Regularization feature selection methods combined with the Random Forest ensemble method give the best results, with 99% accuracy. We also found that the Random Forest ensemble method predicted better than the algorithms reported in previous years' work.
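For reference, the three reported measures can be computed from predicted and true labels as sketched below. The sketch assumes the standard confusion-matrix definitions (sensitivity as recall on the sick class) and uses made-up labels; the function name and the data are illustrative only and do not reproduce the paper's results.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision and sensitivity from true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return {
        # Share of all patients classified correctly.
        "accuracy": (tp + tn) / len(pairs),
        # Share of predicted-sick patients who are actually sick.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Share of actually sick patients who were detected.
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical labels for ten patients (1 = sick, 0 = non-sick).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1, 1, 0]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Reporting precision and sensitivity alongside accuracy matters for medical data, because a classifier can reach high accuracy while still missing sick patients.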
|
Keyword:
|
Data Mining Tree-Based Algorithms, Random Forest Ensemble Method, Feature Relevance Method, Feature Elimination Method, Lasso Regularization Method, Heart Disease.
|
EOI:
|
-
|
DOI:
|
https://doi.org/10.31838/ijpr/2020.12.04.013