Integrating external knowledge to supplement training data in semi-supervised learning for text categorization

Title: Integrating external knowledge to supplement training data in semi-supervised learning for text categorization
Publication Type: Journal Article
Year of Publication: 2001
Authors: Benkhalifa, M.; Mouradi, A.; Bouyakhf, H.
Journal: Information Retrieval
Volume: 4
Pagination: 91-113
ISSN: 1386-4564
Abstract

Text Categorization (TC) is the automated assignment of text documents to predefined categories based on document contents. Many learning approaches have been applied to TC and have proved effective. Nevertheless, TC poses many challenges to machine learning. In this paper, we suggest, for text categorization, the integration of external WordNet lexical information to supplement training data for a semi-supervised clustering algorithm which can learn from both training and test documents to classify new unseen documents. This algorithm is the "Semi-Supervised Fuzzy c-Means" (ssFCM). Our experiments use the Reuters 21578 database and consist of binary classifications for categories selected from the 115 TOPICS classes of the Reuters collection. Using the Vector Space Model, each document is represented by its original feature vector augmented with an external feature vector generated using WordNet. We verify experimentally that the integration of WordNet helps ssFCM improve its performance, effectively addresses the classification of documents into categories with few training documents, and does not interfere with the use of training data.
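
The general idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names (`augment`, `ss_fcm`), the fuzzifier `m`, and the iteration count are assumptions made for illustration. The external vector is taken as given rather than derived from WordNet. The sketch also simplifies ssFCM by clamping the memberships of labeled documents to their known class, whereas the published algorithm may weight labeled documents differently.

```python
def augment(original, external):
    """Concatenate a document's own term weights with an external feature vector."""
    return list(original) + list(external)


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def ss_fcm(docs, labels, n_clusters=2, m=2.0, n_iter=50):
    """Simplified semi-supervised fuzzy c-means (illustrative assumption).

    docs:   list of equal-length feature vectors
    labels: class index for each training document, None for unlabeled ones
    Returns (centers, memberships).
    """
    # Assumption: every cluster has at least one labeled document,
    # so no cluster center is ever computed from zero total weight.
    for k in range(n_clusters):
        if k not in labels:
            raise ValueError("each cluster needs at least one labeled document")

    n, dim = len(docs), len(docs[0])

    # Labeled documents get crisp memberships that never change;
    # unlabeled documents start with uniform memberships.
    u = []
    for lab in labels:
        if lab is None:
            u.append([1.0 / n_clusters] * n_clusters)
        else:
            u.append([1.0 if k == lab else 0.0 for k in range(n_clusters)])

    centers = []
    for _ in range(n_iter):
        # Center update: membership-weighted mean over ALL documents.
        centers = []
        for k in range(n_clusters):
            w = [u[i][k] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * docs[i][d] for i in range(n)) / total for d in range(dim)]
            )

        # Membership update: only unlabeled documents are re-estimated.
        for i in range(n):
            if labels[i] is not None:
                continue
            dists = [sq_dist(docs[i], c) for c in centers]
            if min(dists) == 0.0:
                # Document coincides with a center: assign it fully.
                hit = dists.index(0.0)
                u[i] = [1.0 if k == hit else 0.0 for k in range(n_clusters)]
            else:
                # Standard FCM rule expressed with squared distances.
                inv = [d ** (-1.0 / (m - 1.0)) for d in dists]
                s = sum(inv)
                u[i] = [v / s for v in inv]

    return centers, u
```

To use the sketch, build each document vector with `augment(original, external)`, pass `None` as the label of every test document, and read the predicted class of an unlabeled document as the index of its largest membership value.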

DOI: 10.1023/A:1011458711300