
CLC number: U491; TP181

On-line Access: 2022-12-14

Received: 2022-07-28

Revision Accepted: 2022-12-17

Crosschecked: 2022-10-06


Frontiers of Information Technology & Electronic Engineering  2022 Vol.23 No.12 P.1795-1813


Image-based traffic signal control via world models

Author(s):  Xingyuan DAI, Chen ZHAO, Xiao WANG, Yisheng LV, Yilun LIN, Fei-Yue WANG

Affiliation(s):  The State Key Laboratory for Management and Control of Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China

Corresponding email(s):   feiyue.wang@ia.ac.cn

Key Words:  Traffic signal control, Traffic prediction, Traffic world model, Reinforcement learning

Xingyuan DAI, Chen ZHAO, Xiao WANG, Yisheng LV, Yilun LIN, Fei-Yue WANG. Image-based traffic signal control via world models[J]. Frontiers of Information Technology & Electronic Engineering, 2022, 23(12): 1795-1813.


%0 Journal Article
%T Image-based traffic signal control via world models
%A Xingyuan DAI
%A Chen ZHAO
%A Xiao WANG
%A Yisheng LV
%A Yilun LIN
%A Fei-Yue WANG
%J Frontiers of Information Technology & Electronic Engineering
%V 23
%N 12
%P 1795-1813
%@ 2095-9184
%D 2022
%I Zhejiang University Press & Springer
%DOI 10.1631/FITEE.2200323


Traffic signal control is shifting from passive control to proactive control, which enables the controller to direct current traffic flow to reach its expected destinations. To this end, an effective prediction model is needed for signal controllers. What to predict, how to predict, and how to leverage the prediction for control policy optimization are critical problems for proactive traffic signal control. In this paper, we use an image that contains vehicle positions to describe intersection traffic states. Then, inspired by a model-based reinforcement learning method, DreamerV2, we introduce a novel learning-based traffic world model. The traffic world model, which describes traffic dynamics in image form, is used as an abstract alternative to the traffic environment to generate multi-step planning data for control policy optimization. In the execution phase, the optimized traffic controller directly outputs actions in real time based on abstract representations of traffic states, and the world model can also predict the impact of different control behaviors on future traffic conditions. Experimental results indicate that the traffic world model enables the optimized real-time control policy to outperform common baselines, and that the model achieves accurate image-based prediction, showing promising applications in future traffic signal control.
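As a rough illustration of what "an image that contains vehicle positions" could look like, the sketch below rasterizes vehicle coordinates into a small binary occupancy grid. It is not the authors' encoding: the function name `rasterize_positions`, the 100 m observation window, the 8 x 8 resolution, and the binary cell values are all assumptions chosen for brevity.

```python
from typing import List, Sequence, Tuple


def rasterize_positions(
    positions: Sequence[Tuple[float, float]],
    extent_m: float = 100.0,
    size: int = 8,
) -> List[List[int]]:
    """Return a size x size binary occupancy grid of vehicle (x, y) positions.

    Assumed conventions (illustrative only): coordinates are in metres, the
    intersection centre is (0, 0), and the image covers
    [-extent_m / 2, extent_m / 2) on both axes, with row 0 at the top (north).
    """
    image = [[0] * size for _ in range(size)]
    cell = extent_m / size      # side length of one pixel in metres
    half = extent_m / 2.0
    for x, y in positions:
        if not (-half <= x < half and -half <= y < half):
            continue            # vehicle lies outside the observed window
        col = int((x + half) // cell)
        row = size - 1 - int((y + half) // cell)
        image[row][col] = 1     # mark the cell as occupied
    return image


if __name__ == "__main__":
    grid = rasterize_positions([(0.0, 0.0), (-50.0, -50.0), (49.9, 49.9)])
    for line in grid:
        print("".join("#" if v else "." for v in line))
```

A learned world model of the kind the abstract describes would consume a sequence of such images together with signal actions; a practical encoding would likely use a finer grid and possibly extra channels, which this sketch leaves out.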






