Article, Aug 21, 2003, Closed access
Tackling the Poor Assumptions of Naive Bayes Text Classifiers
Massachusetts Institute of Technology
Abstract
Naive Bayes is often used as a baseline in text classification because it is fast and easy to implement. Its severe assumptions make such efficiency possible but also adversely affect the quality of its results. In this paper we propose simple, heuristic solutions to some of the problems with Naive Bayes classifiers, addressing both systemic issues and problems that arise because text is not actually generated according to a multinomial model. We find that our simple corrections result in a fast algorithm that is competitive with state-of-the-art text classification algorithms such as the Support Vector Machine.
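For orientation, the kind of baseline the abstract refers to can be sketched in a few lines. This is a minimal multinomial Naive Bayes with add-one smoothing, not the corrected algorithm the paper proposes. The function names (`train_multinomial_nb`, `predict`), the `alpha` parameter, and the toy documents are illustrative assumptions rather than anything taken from the paper.

```python
import math
from collections import Counter, defaultdict


def train_multinomial_nb(docs, labels, alpha=1.0):
    """Fit log priors and smoothed log word likelihoods per class.

    docs   : list of token lists (hypothetical toy input format)
    labels : list of class labels, one per document
    alpha  : additive smoothing constant (assumed default of 1.0)
    """
    vocab = sorted({w for doc in docs for w in doc})
    class_doc_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        word_counts[label].update(doc)

    # Class prior: fraction of training documents in each class.
    log_prior = {c: math.log(n / len(docs)) for c, n in class_doc_counts.items()}

    # Word likelihood: smoothed relative frequency of each word in the class.
    log_lik = {}
    for c in class_doc_counts:
        total = sum(word_counts[c].values()) + alpha * len(vocab)
        log_lik[c] = {
            w: math.log((word_counts[c][w] + alpha) / total) for w in vocab
        }
    return log_prior, log_lik


def predict(model, doc):
    """Return the class with the highest posterior log score for a token list."""
    log_prior, log_lik = model
    scores = {
        # Words outside the training vocabulary are skipped.
        c: log_prior[c] + sum(log_lik[c][w] for w in doc if w in log_lik[c])
        for c in log_prior
    }
    return max(scores, key=scores.get)


# Toy data, invented purely for illustration.
docs = [
    ["cheap", "pills", "cheap"],
    ["meeting", "agenda"],
    ["cheap", "offer"],
    ["project", "meeting"],
]
labels = ["spam", "ham", "spam", "ham"]
model = train_multinomial_nb(docs, labels)
```

Calling `predict(model, ["cheap", "offer", "pills"])` scores each class by summing the log prior and the per-word log likelihoods. The independence assumption behind that sum is what makes the classifier fast, and it is also one of the "severe assumptions" the abstract mentions.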
Citation impact
- Total citations: 952
- FWCI: 27.62
- Percentile: 100%
- References: 15
Authors: 4
Topics & keywords
Keywords
- Naive Bayes classifier
- Computer science
- Machine learning
- Support vector machine
- Bayes' theorem
- Artificial intelligence
- Simple (philosophy)
- Heuristic