An empirical study to address the problem of Unbalanced Data Sets in Sentiment Classification

Title: An empirical study to address the problem of Unbalanced Data Sets in Sentiment Classification
Publication Type: Conference Paper
Year of Publication: 2012
Authors: Mountassir, A, Benbrahim, H, Berrada, I
Conference Name: PROCEEDINGS 2012 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC)
Publisher: IEEE Systems, Man, and Cybernetics Soc (SMC); IEEE; Korea Univ; Korean Soc Cognit Sci (KSCS); Korean Inst Informat Scientists and Engineers Soc Computat Intelligence (KSCI); Hi Seoul; Korea Tourism Org; Asian Off Aerosp Res and Dev (AOARD); Natl Res Fdn K
ISBN Number: 978-1-4673-1714-6
Abstract

With the emergence of Web 2.0, Sentiment Analysis is receiving more and more attention. Several interesting works have addressed different issues in Sentiment Analysis. Nevertheless, the problem of unbalanced data sets has not been sufficiently addressed within this research area. This paper presents the study we have carried out to address the problem of unbalanced data sets in supervised sentiment classification in a multi-lingual context. We propose three different methods to under-sample the majority class documents: Remove Similar, Remove Farthest and Remove by Clustering. Our goal is to compare the effectiveness of the proposed methods with that of the common random under-sampling. We also aim to evaluate the behavior of the classifiers under different under-sampling rates. We use three common classifiers, namely Naive Bayes, Support Vector Machines and k-Nearest Neighbors. The experiments are carried out on two Arabic data sets and one English data set. We show that the four under-sampling methods are typically competitive. Naive Bayes proves insensitive to unbalanced data sets, whereas Support Vector Machines seem to be highly sensitive to them; k-Nearest Neighbors shows only a slight sensitivity to imbalance in comparison with Support Vector Machines.
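
For readers unfamiliar with the baseline mentioned in the abstract, random under-sampling can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the function name `random_undersample`, its parameters, and in particular the definition of the under-sampling rate (the fraction of the majority-class surplus that is removed) are assumptions, since the abstract does not define them. The three proposed methods are not sketched because the abstract does not describe how they work.

```python
import random
from collections import Counter


def random_undersample(docs, labels, rate, seed=0):
    """Randomly drop majority-class documents (hypothetical helper).

    ASSUMPTION: `rate` is the fraction of the majority-class surplus to
    remove. rate=0.0 keeps every document; rate=1.0 shrinks the majority
    class to the size of the largest other class.
    """
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be in [0, 1]")

    counts = Counter(labels)
    majority, n_major = counts.most_common(1)[0]
    # Size of the largest non-majority class (falls back to n_major
    # when there is only one class, so nothing is removed).
    n_other = max(
        (c for lab, c in counts.items() if lab != majority),
        default=n_major,
    )
    n_remove = round(rate * (n_major - n_other))

    # Pick the majority-class documents to discard, reproducibly.
    rng = random.Random(seed)
    major_idx = [i for i, lab in enumerate(labels) if lab == majority]
    drop = set(rng.sample(major_idx, n_remove))

    keep = [i for i in range(len(labels)) if i not in drop]
    return [docs[i] for i in keep], [labels[i] for i in keep]


# Usage: 8 positive vs. 2 negative documents, fully balanced at rate 1.0.
docs = [f"d{i}" for i in range(10)]
labels = ["pos"] * 8 + ["neg"] * 2
balanced_docs, balanced_labels = random_undersample(docs, labels, rate=1.0)
```

Minority-class documents are never removed in this sketch; only the choice of which majority-class documents to discard is random, which is the step the proposed methods would replace with a selection criterion.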
