Taxi Service Trajectory (TST) Prediction Challenge 2015


Electronic taxi dispatch systems are in wide use today. They replace traditional VHF-radio dispatch with mobile data terminals installed in the taxis, which typically report GPS position and taximeter state. In the last couple of years, broadcast radio messages for service dispatching have given way to unicast messages between the taxi central and the selected vehicle.

In most cases, taxi drivers operating through an electronic dispatch system do not indicate the final destination of the ride. In some cases, particularly when demand for taxis exceeds availability, the taxi best placed to serve a location is one that is about to end its current ride at that location. With broadcast radio dispatching this was not a problem. With unicast dispatching it is, because the central has to select a vehicle without knowing where busy taxis are headed. To improve the efficiency of electronic taxi dispatching, it therefore becomes important to predict the final destination of busy taxis.

Several sources of information can help. The spatial trajectory of a busy taxi offers hints about where it is going. Given the taxi ID, it may be possible to guess the final destination from the regularity of pre-hired services. In a significant share of taxi rides (approximately 25%), the taxi was called through the taxi call-center, so the passenger's telephone ID can be used to narrow the destination prediction using the historical ride data for that telephone ID.

In this challenge, we ask you to build a predictive framework that infers the final destination of each taxi ride from its initial partial trajectory. The challenge has two outputs: (a) the destination coordinates (WGS84) and (b) the total travel time of the trip (in seconds, counted from the starting point of the service).
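To make the two outputs concrete, here is a minimal sketch of a naive baseline. It is not an official method or scoring rule. It assumes a partial trajectory is a list of (longitude, latitude) pairs in WGS84 degrees sampled at a constant interval; the function names, the `sample_interval_s` parameter, and the use of great-circle distance to measure a destination error are all assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, assumed for this sketch


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometres between two points given in degrees."""
    lon1, lat1, lon2, lat2 = map(math.radians, (lon1, lat1, lon2, lat2))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def naive_prediction(partial_trajectory, sample_interval_s):
    """Baseline guess: the ride ends where the taxi was last seen.

    partial_trajectory: list of (longitude, latitude) pairs, oldest first.
    sample_interval_s: assumed constant time between GPS fixes, in seconds.
    Returns ((longitude, latitude), elapsed_seconds).
    """
    if not partial_trajectory:
        raise ValueError("need at least one GPS point")
    destination = partial_trajectory[-1]
    # Time already travelled; only a lower bound on the total travel time.
    elapsed_s = (len(partial_trajectory) - 1) * sample_interval_s
    return destination, elapsed_s


# Made-up coordinates, for illustration only.
trajectory = [(-8.600, 41.100), (-8.610, 41.110), (-8.620, 41.120)]
guess, elapsed = naive_prediction(trajectory, sample_interval_s=15)
error_km = haversine_km(guess[0], guess[1], -8.630, 41.130)
```

Any real model should beat this baseline: the last known position ignores the direction of travel, and the elapsed time always underestimates the total duration of an unfinished ride.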

Solution? There is none! This is your job! What will you use? Simple correlations? Spatial discretization… maybe? How about the regular services… are they worth exploring? Is Bayesian learning a fair approach for destination mining? Is the destination related to the travel time? How? Bring us your solution!
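As one possible reading of the spatial discretization idea, the sketch below maps WGS84 coordinates onto a regular grid and picks the most frequent destination cell from a set of past rides. The cell size, the function names, and the sample data are hypothetical; this is a starting point under those assumptions, not a recommended solution.

```python
import math
from collections import Counter


def to_cell(lon, lat, cell_deg=0.01):
    """Map a point in degrees to an integer (column, row) grid cell."""
    return (math.floor(lon / cell_deg), math.floor(lat / cell_deg))


def most_common_destination_cell(destinations, cell_deg=0.01):
    """Return the grid cell in which most of the given destinations fall."""
    if not destinations:
        raise ValueError("need at least one destination")
    counts = Counter(to_cell(lon, lat, cell_deg) for lon, lat in destinations)
    return counts.most_common(1)[0][0]


# Made-up past destinations, for illustration only.
history = [(-8.615, 41.125), (-8.613, 41.127), (-8.585, 41.155)]
cell = most_common_destination_cell(history)
```

The same cells could serve as classes for a classifier or as states for a Bayesian model over destinations, at the cost of a resolution limit set by `cell_deg`.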