Frontiers of Information Technology & Electronic Engineering

ISSN 2095-9184 (print), ISSN 2095-9230 (online)

Home location inference from sparse and noisy data: models and applications

Abstract: Accurate home location is increasingly important for urban computing. Existing methods either rely on continuous (and expensive) Global Positioning System (GPS) data or suffer from poor accuracy. In particular, the sparse and noisy nature of social media data poses serious challenges in pinpointing where people live at scale. We revisit this research topic and infer home location within 100 m×100 m squares at 70% accuracy for 76% and 71% of active users in New York City and the Bay Area, respectively. To the best of our knowledge, this is the first time home location has been detected at such a fine granularity using sparse and noisy data. Since people spend a large portion of their time at home, our model enables novel applications. As an example, we focus on modeling people’s health at scale by linking their home locations with publicly available statistics, such as education disparity. Results in multiple geographic regions demonstrate both the effectiveness and added value of our home localization method and reveal insights that eluded earlier studies. In addition, we are able to discover the real buzz in the communities where people live.

Key words: Home location, Mobility patterns, Healthcare

Chinese Summary (translated): Home location inference from sparse and noisy data: models and applications

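Objective: Home is the center of people's lives. Because of this special significance, determining home location is particularly important in studies of human activity. This paper aims to accurately predict the specific location of a person's home (to within 100 m) from that person's check-in records.
Innovation: Home location is private information, so we cannot, and must not, use users' private data directly for research. Collecting and approximating ground-truth data is therefore the first challenge. Our solution rests on the assumption that what people say at home differs from what they say elsewhere. When people check in from home, they tend to use certain characteristic words, such as "sleep" and "shower". We collected check-ins containing such words and had several people screen them. If all of them judged that a check-in was sent from home, we took the location of that check-in to be the sender's home location.
Methods: Key features are extracted from people's check-ins, and these features are then combined by data mining algorithms into an overall judgment. The features considered include how often and at what times a person appears at a location, and whether the person appears there at night.
Conclusions: Experiments show that home location can be predicted with at least 70% accuracy for over 70% of active social network users, to within 100 m.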
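
The labeling rule and the check-in features described in the summary can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the word list contains only the two examples given in the summary, the night window (20:00 to 06:00), the grid projection, and the scoring rule in `guess_home` are all assumptions, and the paper's actual data mining algorithm is not specified here. All function names are hypothetical.

```python
import math
from collections import defaultdict
from datetime import datetime

CELL_METERS = 100.0              # grid size mentioned in the abstract
METERS_PER_DEG_LAT = 111_320.0   # approximation (assumption)
HOME_WORDS = {"sleep", "shower"}  # only the two examples from the summary


def is_home_labeled(text: str, annotator_votes: list) -> bool:
    """Accept a check-in as a home label only if it contains a home word
    and every annotator judged it to be sent from home."""
    has_word = any(w in HOME_WORDS for w in text.lower().split())
    return has_word and len(annotator_votes) > 0 and all(annotator_votes)


def cell_of(lat: float, lon: float, ref_lat: float) -> tuple:
    """Snap a coordinate to a grid cell that is roughly 100 m x 100 m
    near ref_lat (simple equirectangular approximation)."""
    meters_per_deg_lon = METERS_PER_DEG_LAT * math.cos(math.radians(ref_lat))
    row = math.floor(lat * METERS_PER_DEG_LAT / CELL_METERS)
    col = math.floor(lon * meters_per_deg_lon / CELL_METERS)
    return row, col


def cell_features(checkins, ref_lat: float, night_start: int = 20, night_end: int = 6) -> dict:
    """Per-cell features for one user.

    checkins: iterable of (lat, lon, timestamp) with timestamp a local-time datetime.
    """
    counts = defaultdict(int)
    night = defaultdict(int)
    days = defaultdict(set)
    total = 0
    for lat, lon, ts in checkins:
        cell = cell_of(lat, lon, ref_lat)
        counts[cell] += 1
        days[cell].add(ts.date())
        if ts.hour >= night_start or ts.hour < night_end:
            night[cell] += 1
        total += 1
    return {
        cell: {
            "visit_share": counts[cell] / total,        # frequency at this cell
            "night_share": night[cell] / counts[cell],  # fraction of visits at night
            "active_days": len(days[cell]),             # distinct days seen here
        }
        for cell in counts
    }


def guess_home(features: dict):
    """Stand-in scoring rule (assumption): prefer cells visited often and at night."""
    if not features:
        return None
    return max(
        features,
        key=lambda c: (
            features[c]["night_share"] * features[c]["visit_share"],
            features[c]["active_days"],
        ),
    )
```

In practice the hand-written score in `guess_home` would likely be replaced by a classifier trained on the cells that `is_home_labeled` marks as home, with the per-cell features as inputs.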

Key words (translated): Home location; Mobility patterns; Healthcare


DOI: 10.1631/FITEE.1500385
CLC number: TP391
Full text downloaded: 2123
Summary downloaded: 1514
Clicked: 5622
Cited: 1
On-line Access: 2016-05-04
Received: 2015-11-07
Revision Accepted: 2016-02-19
Crosschecked: 2016-04-11

Journal of Zhejiang University-SCIENCE, 38 Zheda Road, Hangzhou 310027, China
Tel: +86-571-87952276; Fax: +86-571-87952331; E-mail: jzus@zju.edu.cn
Copyright © 2000~ Journal of Zhejiang University-SCIENCE