CLC number: TP393.4; R51

On-line Access: 2010-03-22

Received: 2009-06-25

Revision Accepted: 2009-09-29

Crosschecked: 2010-01-29

Journal of Zhejiang University SCIENCE C 2010 Vol.11 No.4 P.241-248


Notifiable infectious disease surveillance with data collected by search engine

Author(s):  Xi-chuan Zhou, Hai-bin Shen

Affiliation(s):  Department of Information Science and Electronic Engineering, Zhejiang University, Hangzhou 310027, China; School of Electrical Engineering, Zhejiang University, Hangzhou 310027, China

Corresponding email(s):   shenhb@yahoo.cn

Key Words:  Notifiable infectious diseases, Disease surveillance, Search engine

Xi-chuan Zhou, Hai-bin Shen. Notifiable infectious disease surveillance with data collected by search engine[J]. Journal of Zhejiang University Science C, 2010, 11(4): 241-248.

%0 Journal Article
%T Notifiable infectious disease surveillance with data collected by search engine
%A Xi-chuan Zhou
%A Hai-bin Shen
%J Journal of Zhejiang University SCIENCE C
%V 11
%N 4
%P 241-248
%@ 1869-1951
%D 2010
%I Zhejiang University Press & Springer
%DOI 10.1631/jzus.C0910371

Notifiable infectious diseases are a major public health concern in China, causing about five million illnesses and twelve thousand deaths every year. Early detection of disease activity, when followed by a rapid response, can reduce both the social and medical impact of the disease. We aim to improve early detection by monitoring health-seeking behavior and disease-related news over the Internet. Specifically, we counted unique search queries submitted to the Baidu search engine in 2008 that contained disease-related search terms. We also counted the news articles aggregated by Baidu’s robot programs that contained disease-related keywords. We found that the search frequency data and the news count data both have a distinct temporal association with disease activity. We adopted a linear model and used searches and news with 1–200-day lead times as explanatory variables to predict the number of infections and deaths attributable to four notifiable infectious diseases: scarlet fever, dysentery, AIDS, and tuberculosis. With the search frequency data and news count data, our approach can quantitatively estimate up-to-date epidemic trends 10–40 days ahead of the release of Chinese Centers for Disease Control and Prevention (Chinese CDC) reports. This approach may provide an additional tool for notifiable infectious disease surveillance.
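
The lagged linear model described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single fixed lead time and ordinary least squares, whereas the abstract only states that searches and news with 1–200-day lead times were used as explanatory variables. The function names (`fit_lagged_linear_model`, `predict_ahead`) and all numbers are hypothetical, and the series are synthetic, generated only to show the alignment of lagged predictors with later case counts.

```python
import numpy as np


def fit_lagged_linear_model(cases, searches, news, lag):
    """Least-squares fit of cases[t] ~ b0 + b1*searches[t-lag] + b2*news[t-lag].

    Illustrative assumption: one shared lead time `lag` (in days) for both
    explanatory series. Returns the coefficients [b0, b1, b2].
    """
    cases = np.asarray(cases, dtype=float)
    searches = np.asarray(searches, dtype=float)
    news = np.asarray(news, dtype=float)
    if not (len(cases) == len(searches) == len(news)):
        raise ValueError("all series must have the same length")
    if lag < 1 or lag >= len(cases):
        raise ValueError("lag must be between 1 and len(cases) - 1")

    # Pair each day's searches/news with the case count `lag` days later.
    X = np.column_stack([
        np.ones(len(cases) - lag),  # intercept column
        searches[:-lag],
        news[:-lag],
    ])
    y = cases[lag:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def predict_ahead(coef, searches, news):
    """Predict case counts `lag` days after the given search/news counts."""
    searches = np.asarray(searches, dtype=float)
    news = np.asarray(news, dtype=float)
    return coef[0] + coef[1] * searches + coef[2] * news


# Synthetic, noise-free demonstration data (not real surveillance data).
rng = np.random.default_rng(0)
days, lead = 120, 14
searches = rng.uniform(50, 150, size=days)
news = rng.uniform(0, 20, size=days)
cases = np.zeros(days)
cases[lead:] = 5.0 + 0.3 * searches[:-lead] + 1.5 * news[:-lead]

coef = fit_lagged_linear_model(cases, searches, news, lead)
# The most recent `lead` days of searches/news give estimates for days
# whose case counts have not been observed yet.
forecast = predict_ahead(coef, searches[-lead:], news[-lead:])
print(coef)
print(forecast.shape)
```

In this sketch the lead time is what makes the estimate "ahead" of official reports: once the coefficients are fitted on past data, today's search and news counts yield an estimate for a day `lead` days in the future. Choosing the lead time (for example, by comparing fits across candidate lags) is left out here and would be a separate design decision.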
