eCite Digital Repository

Enhancing transferability of deep reinforcement learning-based variable speed limit control using transfer learning


Ke, Z and Li, Z and Cao, Z and Liu, P, Enhancing transferability of deep reinforcement learning-based variable speed limit control using transfer learning, IEEE Transactions on Intelligent Transportation Systems pp. 1-12. ISSN 1524-9050 (2020) [Refereed Article]

PDF (Accepted Version)

Copyright Statement

Copyright 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

DOI: 10.1109/TITS.2020.2990598


This study evaluates how well transfer learning improves the transferability of deep reinforcement learning-based variable speed limit (VSL) control. A Double Deep Q Network (DDQN)-based VSL control strategy is proposed to reduce total time spent (TTS) on freeways. A real merging bottleneck is modelled in simulation and serves as the source scenario for VSL control. Three types of target scenarios are considered: overspeed scenarios, adverse weather scenarios, and diverse capacity drop scenarios. A stable testing demand and a fluctuating testing demand are used to evaluate the effects of VSL control. The results show that, by updating the neural networks, transfer learning in the DDQN-based VSL control agent successfully transfers knowledge learned in the source scenario to the target scenarios. Compared with VSL control trained from scratch, transfer learning shortens the entire training process by 32.3% to 69.8% while maintaining a similar maximum reward level. Across the various scenarios, the transferred DDQN-based VSL strategy reduces TTS by 26.02% to 67.37% under the stable testing demand and by 21.31% to 69.98% under the fluctuating testing demand. The results also show that when the task similarity between the source and target scenarios is relatively low, transfer learning can lead to a local optimum and may not achieve globally optimal control effects.
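
As an illustration of the two ideas named in the abstract, the following is a minimal Python sketch, not the authors' implementation. It assumes Q-values are held in NumPy arrays and network weights in a dictionary of arrays; the function names `ddqn_targets` and `warm_start`, the discount factor, and all parameter names are hypothetical. The first function computes the standard Double DQN target, in which the online network selects the next action and the target network evaluates it. The second shows one common form of transfer, initializing the target-scenario agent with weights learned in the source scenario before further training. The abstract states only that the neural networks are updated, so this particular form is an assumption.

```python
import numpy as np


def ddqn_targets(rewards, next_q_online, next_q_target, dones, gamma=0.99):
    """Standard Double DQN targets for a batch of transitions.

    rewards:        shape (batch,)
    next_q_online:  shape (batch, n_actions), online net Q(s', .)
    next_q_target:  shape (batch, n_actions), target net Q(s', .)
    dones:          shape (batch,), 1.0 where the episode ended, else 0.0
    gamma:          discount factor (hypothetical default)
    """
    # The online network chooses the greedy next action...
    best_actions = np.argmax(next_q_online, axis=1)
    # ...and the target network supplies the value of that action.
    evaluated = next_q_target[np.arange(len(rewards)), best_actions]
    # Terminal transitions contribute only their immediate reward.
    return rewards + gamma * (1.0 - dones) * evaluated


def warm_start(source_weights):
    """Assumed transfer step: copy source-scenario weights so the
    target-scenario agent starts from them instead of a random init."""
    return {name: w.copy() for name, w in source_weights.items()}
```

In this sketch, training in a target scenario would begin from `warm_start(source_weights)` and then continue with ordinary DDQN updates, which is one way the shortened training reported in the abstract could arise.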

Item Details

Item Type:Refereed Article
Keywords:bottleneck, congestion, reinforcement learning, travel time, transferability, control, transfer learning
Research Division:Information and Computing Sciences
Research Group:Computer vision and multimedia computation
Research Field:Pattern recognition
Objective Division:Defence
Objective Group:Defence
Objective Field:Intelligence, surveillance and space
UTAS Author:Cao, Z (Dr Zehong Cao)
ID Code:138997
Year Published:2020
Web of Science® Times Cited:8
Deposited By:Information and Communication Technology
Deposited On:2020-05-18
Last Modified:2020-08-24
Downloads:19
