Journal of Zhejiang University SCIENCE A 2005 Vol.6 No.1 P.49-55

http://doi.org/10.1631/jzus.2005.A0049


An improved TF-IDF approach for text classification


Author(s):  Yun-tao Zhang1,2, Ling Gong2, Yong-cheng Wang2

Affiliation(s):  1. Network & Information Center, Shanghai Jiaotong University, Shanghai 200030, China

Corresponding email(s):   ytzhang@mail.sjtu.edu.cn

Key Words:  Term frequency/inverse document frequency (TF-IDF), Text classification, Confidence, Support, Characteristic words


ZHANG Yun-tao, GONG Ling, WANG Yong-cheng. An improved TF-IDF approach for text classification[J]. Journal of Zhejiang University Science A, 2005, 6(1): 49-55.

@article{zhang2005improved,
title="An improved TF-IDF approach for text classification",
author="ZHANG Yun-tao and GONG Ling and WANG Yong-cheng",
journal="Journal of Zhejiang University Science A",
volume="6",
number="1",
pages="49-55",
year="2005",
publisher="Zhejiang University Press \& Springer",
doi="10.1631/jzus.2005.A0049"
}

%0 Journal Article
%T An improved TF-IDF approach for text classification
%A ZHANG Yun-tao
%A GONG Ling
%A WANG Yong-cheng
%J Journal of Zhejiang University SCIENCE A
%V 6
%N 1
%P 49-55
%@ 1673-565X
%D 2005
%I Zhejiang University Press & Springer
%R 10.1631/jzus.2005.A0049

TY  - JOUR
T1  - An improved TF-IDF approach for text classification
A1  - ZHANG Yun-tao
A1  - GONG Ling
A1  - WANG Yong-cheng
JO  - Journal of Zhejiang University Science A
VL  - 6
IS  - 1
SP  - 49
EP  - 55
SN  - 1673-565X
Y1  - 2005
PB  - Zhejiang University Press & Springer
DO  - 10.1631/jzus.2005.A0049
ER  - 


Abstract: 
This paper presents an improved term frequency/inverse document frequency (TF-IDF) approach that uses confidence, support and characteristic words to enhance the recall and precision of text classification. The improved approach also processes synonyms defined by a lexicon. We discuss and analyze in detail the relationship among confidence, recall and precision. Experiments on science and technology texts gave promising results: the new TF-IDF approach improves the precision and recall of text classification compared with the conventional TF-IDF approach.
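
The abstract names the ingredients but this page does not give the paper's formulas, so the following is only a minimal sketch of the quantities involved. It assumes the conventional weighting tf(t, d) · log(N / df(t)) and the usual association-rule definitions of support and confidence for a rule "term → category". The function names, the toy documents and the labels are hypothetical, and how the authors actually combine these values with TF-IDF is not reproduced here.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Conventional TF-IDF weights for a list of tokenized documents."""
    n = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return weights


def support_confidence(docs, labels, term, category):
    """Support and confidence of the rule term -> category (assumed definitions)."""
    # Labels of all documents that contain the term.
    with_term = [lab for doc, lab in zip(docs, labels) if term in doc]
    both = sum(1 for lab in with_term if lab == category)
    support = both / len(docs)  # share of all documents with term AND category
    confidence = both / len(with_term) if with_term else 0.0  # P(category | term)
    return support, confidence


# Hypothetical toy corpus, for illustration only.
docs = [
    ["neural", "network", "training"],
    ["network", "router", "protocol"],
    ["neural", "model", "training"],
    ["protein", "cell", "biology"],
]
labels = ["computing", "computing", "computing", "biology"]

weights = tf_idf(docs)
support, confidence = support_confidence(docs, labels, "neural", "computing")
```

Under these assumed definitions, a term with high confidence for one category is a plausible candidate for a characteristic word of that category, while support indicates how much of the collection the rule actually covers.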

CLC number: TP31

On-line Access: 2024-08-27

Received: 2023-10-17

Revision Accepted: 2024-05-08

