Journal of Zhejiang University SCIENCE A
ISSN 1673-565X(Print), 1862-1775(Online), Monthly
2005 Vol.6 No.1 P.49-55
An improved TF-IDF approach for text classification
Abstract: This paper presents an improved term frequency/inverse document frequency (TF-IDF) approach that uses confidence, support, and characteristic words to enhance the recall and precision of text classification. Synonyms defined by a lexicon are processed in the improved TF-IDF approach. We discuss and analyze in detail the relationship among confidence, recall, and precision. Experiments in the science and technology domain gave promising results: the new TF-IDF approach improves the precision and recall of text classification compared with the conventional TF-IDF approach.
Key words: Term frequency/inverse document frequency (TF-IDF), Text classification, Confidence, Support, Characteristic words
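For orientation, the conventional TF-IDF weighting that the abstract uses as its baseline can be sketched as follows. This is a minimal sketch of one common textbook variant (relative term frequency multiplied by log(N/df)), not the improved method of the paper, whose use of confidence, support, and characteristic words is not specified in the abstract. The function name `tf_idf` and the toy documents are illustrative assumptions.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy corpus, already tokenized.
docs = [
    ["text", "classification", "text"],
    ["text", "retrieval"],
    ["image", "classification"],
]
weights = tf_idf(docs)
```

Under this variant, a term that occurs in every document receives weight zero, while a term confined to a single document is weighted most heavily relative to its frequency.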
DOI: 10.1631/jzus.2005.A0049
CLC number: TP31
On-line Access: 2024-08-27
Received: 2023-10-17
Revision Accepted: 2024-05-08