Heterogeneous ensemble imputation for software development effort estimation

TitreHeterogeneous ensemble imputation for software development effort estimation
Publication TypeConference Paper
Year of Publication2021
AuthorsAbnane, I, Idri, A, Hosni, M, Abran, A
Conference NamePROMISE 2021 - Proceedings of the 17th International Conference on Predictive Models and Data Analytics in Software Engineering, co-located with ESEC/FSE 2021
Mots-clésCase based reasoning, Casebased reasonings (CBR), Decision trees, Empirical evaluations, Expectation Maximization, Forestry, Heterogeneous ensembles, Imputation techniques, K nearest neighbor (KNN), Maximum principle, Multilayer neural networks, Nearest neighbor search, Predictive analytics, Software design, Software development effort, Support vector regression, Support vector regression (SVR)

Choosing the appropriate Missing Data (MD) imputation technique for a given Software development effort estimation (SDEE) technique is not a trivial task. In fact, the impact of the MD imputation on the estimation output depends on the dataset and the SDEE technique used and there is no best imputation technique in all contexts. Thus, an attractive solution is to use more than one single imputation technique and combine their results for a final imputation outcome. This concept is called ensemble imputation and can help to significantly improve the estimation accuracy. This paper develops and evaluates a heterogeneous ensemble imputation whose members were the four single imputation techniques: K-Nearest Neighbors (KNN), Expectation Maximization (EM), Support Vector Regression (SVR), and Decision Trees (DT). The impact of the ensemble imputation was evaluated and compared with those of the four single imputation techniques on the accuracy measured in terms of the standardized accuracy criterion of four SDEE techniques: Case Based Reasoning (CBR), Multi-Layers Perceptron (MLP), Support Vector Regression (SVR) and Reduced Error Pruning Tree (REPTree). The Wilcoxon statistical test was also performed in order to assess whether the results are significant. All the empirical evaluations were carried out over the six datasets, namely, ISBSG, China, COCOMO81, Desharnais, Kemerer, and Miyazaki. Results show that the use of heterogeneous ensemble-based imputation instead single imputation significantly improved the accuracy of the four SDEE techniques. Indeed, the ensemble imputation technique was ranked either first or second in all contexts. © 2021 ACM.




Suivez-nous sur





Avenue Mohammed Ben Abdallah Regragui, Madinat Al Irfane, BP 713, Agdal Rabat, Maroc

  Télécopie : (+212) 5 37 68 60 78

  Secrétariat de direction : 06 61 48 10 97

        Secrétariat général : 06 61 34 09 27

        Service des affaires financières : 06 61 44 76 79

        Service des affaires estudiantines : 06 62 77 10 17 / n.mhirich@um5s.net.ma

        Résidences : 06 61 82 89 77



Education - This is a contributing Drupal Theme
Design by WeebPal.