Journal of Zhejiang University SCIENCE A 2004 Vol.5 No.9 P.1106-1113


Improving the precision of the keyword-matching pornographic text filtering method using a hybrid model

Author(s):  SU Gui-yang, LI Jian-hua, MA Ying-hua, LI Sheng-hong

Affiliation(s):  Department of Electronic Engineering, Shanghai Jiaotong University, Shanghai 200030, China

Corresponding email(s):   sugy@sjtu.edu.cn

Key Words:  Pornographic text filtering, Content based filtering, Information filtering, Network content security

SU Gui-yang, LI Jian-hua, MA Ying-hua, LI Sheng-hong. Improving the precision of the keyword-matching pornographic text filtering method using a hybrid model[J]. Journal of Zhejiang University Science A, 2004, 5(9): 1106-1113.

As pornographic material proliferates on the Internet, keeping users away from such offensive content has become one of the most important research areas in network information security. Several applications that block or filter this content are in use, and their approaches fall roughly into two classes: metadata based and content based. With the development of distributed technologies, content based filtering will play an increasingly important role in filtering systems. Keyword matching is a content based method widely used in harmful text filtering. Experiments evaluating the method showed that its recall is rather high but its precision is unsatisfactory. Based on these results, a new pornographic text filtering model based on reconfirming is put forward. Experiments showed that the model is practical, loses little recall relative to the single keyword matching method, and achieves higher precision.
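The two-stage idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how the reconfirming stage works, so the keyword-density rule below is purely an assumption made for illustration, as are the keyword list, the function names, and the `min_density` threshold; none of them are taken from the paper.

```python
import re

# Hypothetical placeholder lexicon; a real filter would use a curated keyword list.
KEYWORDS = {"porn", "xxx"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def keyword_match(text, keywords=KEYWORDS):
    """Stage 1: flag the text if any keyword occurs (high recall, low precision)."""
    return any(tok in keywords for tok in tokenize(text))


def reconfirm(text, keywords=KEYWORDS, min_density=0.1):
    """Stage 2 (assumed rule, not the paper's): confirm only when the share of
    keyword tokens reaches min_density."""
    tokens = tokenize(text)
    if not tokens:
        return False
    hits = sum(tok in keywords for tok in tokens)
    return hits / len(tokens) >= min_density


def hybrid_filter(text):
    """Block a text only if it passes keyword matching and is then reconfirmed."""
    return keyword_match(text) and reconfirm(text)


def precision_recall(predicted, actual):
    """Compute (precision, recall) from parallel lists of boolean labels."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(a and not p for p, a in zip(predicted, actual))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

In this sketch, a long article that mentions a keyword once is flagged by `keyword_match` but rejected by `reconfirm`, which is the kind of false positive a second stage is meant to remove. `precision_recall` shows how the two measures named in the abstract are computed from labeled predictions.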



