A Brief Survey of Text Mining: Classification, Clustering and Extraction Techniques

Allahyari, Mehdi; Pouriyeh, Seyedamin; Assefi, Mehdi; Safaei, Saied; Trippe, Elizabeth D.; Gutiérrez, Juan B.; Kochut, Krys J.

doi:10.48550/arxiv.1707.02919

Preprint, arXiv (Cornell University), Jul 10, 2017, Green OA

Indexed in: arXiv, DataCite

Abstract

The amount of text generated every day is increasing dramatically. This tremendous volume of mostly unstructured text cannot be easily processed and interpreted by computers. Therefore, efficient and effective techniques and algorithms are required to discover useful patterns. Text mining, the task of extracting meaningful information from text, has gained significant attention in recent years. In this paper, we describe several of the most fundamental text mining tasks and techniques, including text pre-processing, classification, and clustering. Additionally, we briefly explain text mining in the biomedical and health care domains.

Citation impact

Total citations: 514
References: 123

Authors: 7

Topics & keywords

Keywords

Computer science
Cluster analysis
Text mining
Biomedical text mining
Task (project management)
Information extraction
Information retrieval
Concept mining

UN Sustainable Development Goals

Quality Education
