
CLC number: U121; TP391

On-line Access: 2012-10-01

Received: 2012-02-23

Revision Accepted: 2012-06-11

Crosschecked: 2012-08-20



Journal of Zhejiang University SCIENCE C 2012 Vol.13 No.10 P.750-760


Transit smart card data mining for passenger origin information extraction

Author(s):  Xiao-lei Ma, Yin-hai Wang, Feng Chen, Jian-feng Liu

Affiliation(s):  Department of Civil and Environmental Engineering, University of Washington, Seattle, WA 98195-2700, USA

Corresponding email(s):   xiaolm@uw.edu, yinhai@uw.edu

Key Words:  Transit smart card, Automated fare collection (AFC), Bayesian decision tree, Markov chain, Origin inference

Xiao-lei Ma, Yin-hai Wang, Feng Chen, Jian-feng Liu. Transit smart card data mining for passenger origin information extraction[J]. Journal of Zhejiang University Science C, 2012, 13(10): 750-760.

@article{Ma2012,
title="Transit smart card data mining for passenger origin information extraction",
author="Xiao-lei Ma and Yin-hai Wang and Feng Chen and Jian-feng Liu",
journal="Journal of Zhejiang University Science C",
volume="13",
number="10",
pages="750--760",
year="2012",
publisher="Zhejiang University Press \& Springer",
doi="10.1631/jzus.C12a0049"
}

%0 Journal Article
%T Transit smart card data mining for passenger origin information extraction
%A Xiao-lei Ma
%A Yin-hai Wang
%A Feng Chen
%A Jian-feng Liu
%J Journal of Zhejiang University SCIENCE C
%V 13
%N 10
%P 750-760
%@ 1869-1951
%D 2012
%I Zhejiang University Press & Springer
%DOI 10.1631/jzus.C12a0049

TY - JOUR
T1 - Transit smart card data mining for passenger origin information extraction
A1 - Xiao-lei Ma
A1 - Yin-hai Wang
A1 - Feng Chen
A1 - Jian-feng Liu
JO - Journal of Zhejiang University Science C
VL - 13
IS - 10
SP - 750
EP - 760
SN - 1869-1951
Y1 - 2012
PB - Zhejiang University Press & Springer
DO - 10.1631/jzus.C12a0049
ER -

The automated fare collection (AFC) system, also known as the transit smart card (SC) system, has become increasingly popular among transit agencies worldwide. Compared with conventional manual fare collection, an AFC system offers inherent advantages: low labor cost and high efficiency in fare collection and transaction data archiving. Although SC transactions can yield highly valuable data, extracting such data requires substantial effort and dedicated methodologies, because most AFC systems were not originally designed for data collection. This is especially true of the Beijing AFC system, where a passenger’s boarding stop (origin) on a flat-rate bus is not recorded at the check-in scan. To extract passengers’ origin data from recorded SC transaction information, a Markov chain based Bayesian decision tree algorithm is developed in this study. Using the time invariance property of the Markov chain, the algorithm is further optimized and simplified to have linear computational complexity. The algorithm is verified with transit vehicles equipped with global positioning system (GPS) data loggers. The verification results demonstrate that the proposed algorithm is effective in extracting transit passengers’ origin information from SC transactions with relatively high accuracy. Such transit origin data are highly valuable for transit system planning and route optimization.
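The abstract does not spell out the algorithm, so the following is only a toy sketch of the general idea and not the authors' method. It assumes that taps on one bus trip cluster in time at each stop, that the number of stops advanced between clusters follows a fixed (time-invariant) prior, and that travel time per stop is roughly Gaussian. The function names, the prior `hop_prior`, and the parameters `mean_hop`, `sd_hop`, `max_gap`, and `first_stop` are all hypothetical choices made for illustration.

```python
import math

def group_taps(times, max_gap=60.0):
    """Group sorted tap timestamps (seconds) into boarding clusters.

    Taps closer than max_gap to the previous tap are assumed (hypothetically)
    to belong to the same stop.
    """
    clusters = [[times[0]]]
    for t in times[1:]:
        if t - clusters[-1][-1] <= max_gap:
            clusters[-1].append(t)
        else:
            clusters.append([t])
    return clusters

def infer_stops(times, hop_prior, mean_hop=120.0, sd_hop=40.0,
                max_gap=60.0, first_stop=0):
    """Assign a stop index to each tap cluster in a single pass.

    hop_prior maps k (stops advanced between clusters) to a prior probability.
    The same prior is reused at every step, which is the time-invariance
    assumption of this sketch. For each gap, the k with the highest
    posterior (prior times Gaussian likelihood) is chosen.
    """
    if not times:
        return []
    clusters = group_taps(sorted(times), max_gap)
    stops = [first_stop]  # assumption: the first cluster is at a known stop
    for prev, cur in zip(clusters, clusters[1:]):
        gap = cur[0] - prev[0]
        best_k, best_score = None, -math.inf
        for k, prior in hop_prior.items():
            mu = k * mean_hop
            sd = sd_hop * math.sqrt(k)
            # Log posterior up to a constant: log prior + Gaussian log likelihood.
            score = math.log(prior) - math.log(sd) - 0.5 * ((gap - mu) / sd) ** 2
            if score > best_score:
                best_k, best_score = k, score
        stops.append(stops[-1] + best_k)
    # One tuple per cluster: (first tap time, number of taps, inferred stop).
    return [(c[0], len(c), s) for c, s in zip(clusters, stops)]

taps = [0, 5, 12, 130, 135, 380, 384, 390]
print(infer_stops(taps, {1: 0.7, 2: 0.2, 3: 0.1}))
# prints [(0, 3, 0), (130, 2, 1), (380, 3, 3)]
```

Because each gap is evaluated against a fixed, small set of hop candidates, the pass is linear in the number of taps, which mirrors the linear complexity claimed in the abstract without reproducing how the paper achieves it.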




