DYNAMIC BIKE SHARING REBALANCING: A HYBRID FRAMEWORK BASED ON DEEP REINFORCEMENT LEARNING AND MIXED INTEGER PROGRAMMING
Bike sharing systems (BSSs), an emerging mobility mode, have been widely deployed in major cities around the world and are viewed as an environmentally friendly way to serve first/last-mile trips. Because bike usage demand is spatially and temporally unbalanced, stations may become empty or full at different times and in different regions, which makes it inconvenient for customers to rent or return bikes and can result in customer loss. To establish a BSS that meets bike demand and reduces customer loss, the system operator needs to rebalance the system efficiently. Existing studies have mainly developed rebalancing policies based on mixed integer programming (MIP) algorithms, which provide centralized solutions from the perspective of the entire system. However, because a real-world BSS is often large-scale and has dynamic demand, the rebalancing policy must be generated efficiently and must scale to real-time BSS operation. This study proposes a hybrid rebalancing policy to improve the efficiency of BSSs. A model-free deep reinforcement learning (DRL) framework, DeepBike, is first proposed to learn the optimal rebalancing policy for station-based BSSs; it makes distributed online decisions for each vehicle without coordinating with the others. The study then adopts an MIP algorithm and integrates it with DeepBike to develop the hybrid policy, leveraging the "online" solutions of DeepBike and the "near-exact" solutions of the MIP. A large-scale real-world BSS simulator is built from the historical trip records of Divvy (the BSS in Chicago) to compare the performance of the different policies. The results show that, while retaining scalability and efficiency, the hybrid policy improves profit by 29.9% and 39.8% compared with DeepBike and the MIP, respectively. This finding supports the real-world implementation of the hybrid DRL-MIP model to produce online rebalancing policies dynamically for the real-time rebalancing of large-scale BSSs with multiple vehicles.