Learning to Solve Dynamic Vehicle Routing Problems (Research Project)
a joint research project of the Algorithms and Complexity Group, TU Wien, Austria, and the Honda Research Institute, Germany
Vehicle routing - i.e., planning an optimal set of routes for a fleet of vehicles - is an intensely studied research area with enormous and still rising practical relevance due to growing mobility and transportation demand and new challenges arising, e.g., from the increasing interest in shared mobility services and electric vehicles. The dial-a-ride problem (DARP), for example, asks for optimal vehicle tours through different pickup and drop-off locations in order to serve a number of transportation requests, allowing different customers to share a vehicle. The electric autonomous dial-a-ride problem (E-ADARP) is a challenging and practically relevant extension of the DARP in which electric autonomous vehicles are employed and their charging requirements have to be taken into consideration. Furthermore, not only classical objectives such as total travel time have to be optimized; user inconvenience also plays an important role. Thus, factors such as user excess ride time, which arises from detours caused by ride-sharing, have to be taken into account.
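To make the excess-ride-time notion concrete, the following minimal sketch (an illustrative assumption, not the project's actual objective function) computes a user's excess ride time as the actual in-vehicle time minus the direct travel time between that user's pickup and drop-off locations:

```python
def excess_ride_time(pickup_time: float, dropoff_time: float,
                     direct_travel_time: float) -> float:
    """Excess ride time of one user: actual in-vehicle time minus the
    direct pickup-to-drop-off travel time (clamped at zero)."""
    actual_ride_time = dropoff_time - pickup_time
    return max(0.0, actual_ride_time - direct_travel_time)


# A user picked up at t=10 and dropped off at t=25, whose direct trip
# would take 12 time units, incurs 3 units of excess ride time.
print(excess_ride_time(10.0, 25.0, 12.0))
```

In an E-ADARP objective, such per-user excess ride times would typically be summed or weighted against total travel time; the exact aggregation is a modeling choice left open here.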
For such problems, heuristic optimization approaches are considered the method of choice due to their better scalability compared to exact approaches. In this project, we propose a heuristic framework based on large neighborhood search (LNS) to solve the E-ADARP, and we plan to tackle the issue of scalability by automatically designing, i.e., learning, efficient heuristics that either guide or possibly replace classical optimization techniques. We will investigate the use of reinforcement learning with different machine learning models to dynamically select operators for the LNS from a set of candidate operators during the optimization process. We intend to experimentally compare this approach to other learning techniques such as classical supervised learning, imitation learning, and Q-learning.
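A bandit-style selection rule is one simple way such learned operator selection can be realized. The sketch below is illustrative only (the toy tour problem, operator set, and parameter values are assumptions, not the project's framework): it runs an LNS on a small routing instance and chooses between two destroy operators with an epsilon-greedy rule whose per-operator value estimates are updated from the observed cost improvement.

```python
import random


def tour_cost(tour, dist):
    # Total length of a tour given a symmetric distance matrix.
    return sum(dist[tour[i]][tour[i + 1]] for i in range(len(tour) - 1))


def destroy_random(tour, k, dist, rng):
    # Remove k randomly chosen customers (depot 0 stays at both ends).
    removed = rng.sample(tour[1:-1], k)
    return [c for c in tour if c not in removed], removed


def destroy_worst(tour, k, dist, rng):
    # Remove the k customers whose removal saves the most detour cost.
    def detour(i):
        a, b, c = tour[i - 1], tour[i], tour[i + 1]
        return dist[a][b] + dist[b][c] - dist[a][c]
    idx = sorted(range(1, len(tour) - 1), key=detour, reverse=True)[:k]
    removed = [tour[i] for i in idx]
    return [c for c in tour if c not in removed], removed


def repair_greedy(partial, removed, dist):
    # Reinsert each removed customer at its cheapest insertion position.
    tour = partial[:]
    for c in removed:
        pos = min(range(1, len(tour)),
                  key=lambda i: dist[tour[i - 1]][c] + dist[c][tour[i]]
                  - dist[tour[i - 1]][tour[i]])
        tour.insert(pos, c)
    return tour


def lns(dist, n, iters=200, k=3, eps=0.2, alpha=0.1, seed=0):
    rng = random.Random(seed)
    ops = [destroy_random, destroy_worst]
    q = [0.0] * len(ops)  # learned value estimate per destroy operator
    cur = [0] + list(range(1, n)) + [0]
    best = cur[:]
    for _ in range(iters):
        # Epsilon-greedy operator selection: explore with prob. eps,
        # otherwise exploit the operator with the best value estimate.
        if rng.random() < eps:
            i = rng.randrange(len(ops))
        else:
            i = max(range(len(ops)), key=lambda j: q[j])
        partial, removed = ops[i](cur, k, dist, rng)
        cand = repair_greedy(partial, removed, dist)
        # Reward = cost improvement; update the operator's estimate.
        reward = tour_cost(cur, dist) - tour_cost(cand, dist)
        q[i] += alpha * (reward - q[i])
        if tour_cost(cand, dist) <= tour_cost(cur, dist):
            cur = cand
        if tour_cost(cand, dist) < tour_cost(best, dist):
            best = cand
    return best


# Demo on a random 10-location instance (location 0 acts as the depot).
demo_rng = random.Random(1)
pts = [(demo_rng.random(), demo_rng.random()) for _ in range(10)]
dmat = [[((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5 for b in pts]
        for a in pts]
print(tour_cost(lns(dmat, 10), dmat))
```

In the actual project, the operator set, the reward signal, and the selection model (e.g., a neural policy instead of a tabular bandit) are all subject to investigation; this sketch only illustrates the control loop in which such a learned selector sits.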