Data preprocessing for heart disease classification: A systematic literature review.

Title: Data preprocessing for heart disease classification: A systematic literature review.
Publication Type: Journal Article
Year of Publication: 2020
Authors: Benhar, H, Idri, A, Fernández-Alemán, JL
Journal: Computer Methods and Programs in Biomedicine
Keywords: Cardiology, Classification (of information), Classification technique, classifier, clinical practice, clinical research, Computer aided diagnosis, data classification, Data mining, Data preprocessing, data processing, Decision support systems, Deep learning, Diagnosis decision, diagnostic accuracy, disease classification, Diseases, empiricism, evidence based practice, feature selection, Heart, heart disease, Heart Diseases, High dimensionality, human, Humans, intermethod comparison, Machine learning, Performance of classifier, prediction, Prediction systems, Preprocessing techniques, publication, Review, Support vector machines, Systematic literature review, Systematic Review, task performance

Context: Early detection of heart disease is an important challenge, since heart diseases claim 17.3 million lives every year. Moreover, any error in the diagnosis of cardiac disease can be dangerous and put a patient's life at risk. Accurate diagnosis is therefore critical in cardiology. Data Mining (DM) classification techniques have been used to diagnose heart diseases but are still limited by data quality challenges such as inconsistencies, noise, missing data, outliers, high dimensionality and imbalanced data. Data preprocessing (DP) techniques have therefore been used to prepare data, with the goal of improving the performance of DM-based heart disease prediction systems.

Objective: The purpose of this study is to review and summarize the current evidence on the use of preprocessing techniques in heart disease classification with regard to: (1) the DP tasks and techniques most frequently used, (2) the impact of DP tasks and techniques on the performance of classification in cardiology, (3) the overall performance of classifiers when using DP techniques, and (4) comparisons of different classifier-preprocessing combinations in terms of accuracy rate.

Method: A systematic literature review was carried out by identifying and analyzing empirical studies on the application of data preprocessing in heart disease classification published between January 2000 and June 2019. A total of 49 studies were selected and analyzed according to the aforementioned criteria.

Results: The review results show that data reduction is the most used preprocessing task in cardiology, followed by data cleaning. In general, preprocessing either maintained or improved the performance of heart disease classifiers. Some combinations, such as (ANN + PCA), (ANN + CHI) and (SVM + PCA), are promising in terms of accuracy. However, the deployment of these models in real-world diagnosis decision support systems is subject to several risks and limitations due to the lack of interpretability. © 2020 Elsevier B.V.
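
To make the two most common preprocessing tasks named in the abstract concrete, the following is a minimal sketch of one data cleaning step (mean imputation of missing values) and one data reduction step (scoring categorical features with a chi-squared statistic). It is an illustration only, not the method of any reviewed study: the function names `impute_mean` and `chi2_score` are hypothetical, the records are made-up toy values, and reading "CHI" as chi-squared feature selection is an assumption.

```python
from collections import Counter
from statistics import mean


def impute_mean(column):
    """Data cleaning: replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in column]


def chi2_score(feature, labels):
    """Data reduction: Pearson chi-squared statistic between a categorical feature and the class labels.

    A higher score suggests a stronger association with the class.
    """
    n = len(labels)
    feature_counts = Counter(feature)
    label_counts = Counter(labels)
    joint_counts = Counter(zip(feature, labels))
    score = 0.0
    for f, fc in feature_counts.items():
        for l, lc in label_counts.items():
            expected = fc * lc / n            # count expected under independence
            observed = joint_counts.get((f, l), 0)
            score += (observed - expected) ** 2 / expected
    return score


# Toy, made-up records for illustration only (not data from the reviewed studies).
cholesterol = [230, None, 250, 210, None, 290]
chest_pain = ["typical", "typical", "atypical", "atypical", "typical", "atypical"]
sex = ["m", "f", "m", "f", "m", "f"]
disease = [1, 1, 0, 0, 1, 0]

cleaned_cholesterol = impute_mean(cholesterol)

# Rank the categorical features by their chi-squared score, highest first.
scores = {
    "chest_pain": chi2_score(chest_pain, disease),
    "sex": chi2_score(sex, disease),
}
ranking = sorted(scores, key=scores.get, reverse=True)
print(cleaned_cholesterol)
print(ranking)
```

In this toy data the `chest_pain` feature separates the two classes perfectly, so it ranks above `sex`; a selection step would keep the top-ranked features before passing them to a classifier such as an ANN or SVM.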