Anti-Jerk On-Ramp Merging Using Deep Reinforcement Learning [article]

Yuan Lin, John McPhee, Nasser L. Azad
2020 arXiv pre-print
Deep Reinforcement Learning (DRL) is used here for decentralized decision-making and longitudinal control in high-speed on-ramp merging. The DRL environment state comprises the states of five vehicles: the merging vehicle, plus the two preceding and two following vehicles relative to the merging vehicle's position on the main road, whether that position is actual or projected. The control action is the acceleration of the merging vehicle. Deep Deterministic Policy Gradient (DDPG) is the DRL algorithm used for training, since it outputs continuous control actions. We investigated the trade-off between collision avoidance for safety and jerk minimization for passenger comfort in the multi-objective reward function by obtaining the Pareto front. We found that, with a small jerk penalty in the multi-objective reward function, the vehicle jerk could be reduced by 73% compared with no jerk penalty while the collision rate was maintained at zero. Regardless of the jerk penalty, the merging vehicle exhibited decision-making strategies such as merging ahead of or behind a main-road vehicle.
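To make the multi-objective trade-off concrete, here is a minimal Python sketch of a per-step reward that combines a collision penalty with a jerk penalty, together with a simple filter that extracts a Pareto front from (collision rate, mean absolute jerk) pairs. The function names, the weight `w_jerk`, the `collision_penalty` value, and the finite-difference jerk estimate are all illustrative assumptions; the abstract does not specify the reward terms or weights, so this should not be read as the authors' implementation.

```python
from typing import List, Tuple


def estimate_jerk(accel_now: float, accel_prev: float, dt: float) -> float:
    """Finite-difference jerk (m/s^3) from two consecutive accelerations."""
    return (accel_now - accel_prev) / dt


def step_reward(
    collided: bool,
    accel_now: float,
    accel_prev: float,
    dt: float,
    w_jerk: float = 0.01,            # hypothetical jerk weight
    collision_penalty: float = 1.0,  # hypothetical collision penalty
) -> float:
    """Assumed reward shape: penalize collisions and the magnitude of jerk."""
    reward = -collision_penalty if collided else 0.0
    reward -= w_jerk * abs(estimate_jerk(accel_now, accel_prev, dt))
    return reward


def pareto_front(points: List[Tuple[float, float]]) -> List[Tuple[float, float]]:
    """Keep the points not dominated by any other point (both objectives minimized)."""
    front = []
    for i, p in enumerate(points):
        dominated = any(
            q[0] <= p[0] and q[1] <= p[1] and (q[0] < p[0] or q[1] < p[1])
            for j, q in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append(p)
    return front


# Made-up (collision_rate, mean_abs_jerk) pairs, one per trained policy;
# these numbers are placeholders, not results from the paper.
candidates = [(0.0, 3.0), (0.0, 1.0), (0.1, 0.5), (0.2, 0.6)]
front = pareto_front(candidates)
```

In a study of this kind, each candidate point would come from training a policy with a different jerk weight and then evaluating its collision rate and jerk; the filter then shows which weightings are not strictly worse than another on both objectives.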
arXiv:1909.12967v3 fatcat:ldzb5si5pjgdta4fmcla5vn27m