Learning to Predict Demand in a Transport-Resource Sharing Task
Naval Postgraduate School, Monterey, United States
Resource allocation problems occur in many applications. One example is bike-sharing systems, which encourage the use of public transport by making it easy to rent and return bicycles for short transits. With large numbers of distributed kiosks recording the time and location of rental transactions, such a system acts like a sensor network for the movement of people throughout the city. In this thesis, we studied a range of machine-learning algorithms to predict ridership demand in a bike-sharing system, as part of an online competition. Predictions based on the Random Forest and Gradient Boosting algorithms produced results that ranked among the top 15 of more than 3,000 team submissions. We showed that the mandated use of logarithmic error as the evaluation metric overemphasizes errors made during off-peak hours, when rental counts are low. We systematically experimented with model refinements and feature engineering to improve predictions, with mixed results: a reduction in cross-validation error did not always lead to a reduction in test-set error. This could be due to overfitting and to the fact that the competition test set was not a random sample. The approach in this thesis could be generalized to predict the use of other types of shared resources.
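The effect of a logarithmic error metric on off-peak hours can be illustrated with a small sketch. The abstract does not give the exact formula, so the code below assumes the common root mean squared logarithmic error (RMSLE) computed on log(1 + count). The function name `rmsle` and the rental counts are hypothetical and chosen only for illustration.

```python
import math


def rmsle(predicted, actual):
    """Root mean squared logarithmic error over paired count sequences.

    Assumption: the metric is RMSLE on log(1 + count); the abstract only
    says "logarithmic error".
    """
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_log_errors = [
        (math.log1p(p) - math.log1p(a)) ** 2
        for p, a in zip(predicted, actual)
    ]
    return math.sqrt(sum(squared_log_errors) / len(squared_log_errors))


# Hypothetical counts: the prediction is off by 5 rentals in both cases.
off_peak_error = rmsle([7.0], [2.0])      # quiet hour, few rentals
peak_error = rmsle([505.0], [500.0])      # busy hour, many rentals

print(f"off-peak RMSLE: {off_peak_error:.4f}")
print(f"peak RMSLE:     {peak_error:.4f}")
```

Under this assumed metric, the same absolute miss of five rentals is penalized far more heavily in the quiet hour than in the busy one, because the metric measures relative rather than absolute deviation.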