Autoregressive Tensor Decomposition for NYC Taxi Data Analysis
Thesis posted on 31.07.2020, 18:06 by Zongwei Li
Cities have adopted evolving urban digitization strategies, most of which increasingly focus on data, especially in the field of public transportation. Transportation data are inherently spatial and temporal, since trips are usually described by when and where they occur. Because a trip often has many attributes, transportation data can be represented as a tensor, a container that holds data in $N$ dimensions. Unlike a traditional data frame, which has only column variables, a tensor offers a more natural way to explore spatio-temporal datasets and makes those attributes easier to interpret. However, extracting useful and reasonably accurate information from attributes that are highly correlated with each other requires specialized techniques. This work presents a mixed model that combines tensor decomposition with seasonal vector autoregression in time to find latent patterns in historical NYC taxi data classified by taxi type and by pick-up and drop-off times, so that it can help predict where and when taxis will be in demand. We validated the proposed approach experimentally on real NYC taxi data. The proposed method gives the best predictions among the alternative models without geographical inference, and it captures the daily patterns of taxi demand for business and entertainment needs.
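To make the idea of combining tensor decomposition with seasonal vector autoregression more concrete, the following is a minimal sketch and not the thesis's implementation. It rests on several assumptions that the abstract does not state: the tensor is arranged as hour × pick-up zone × taxi type, the decomposition is a CP (CANDECOMP/PARAFAC) model fitted by alternating least squares, and the autoregression uses one ordinary lag plus one seasonal lag of 24 hours on the temporal factor. All function names and the synthetic data are hypothetical.

```python
import numpy as np

def khatri_rao(B, C):
    """Column-wise Kronecker product of B (J x R) and C (K x R), shape (J*K, R)."""
    return np.einsum("jr,kr->jkr", B, C).reshape(-1, B.shape[1])

def cp_als(X, rank, n_iter=200, seed=0):
    """Rank-`rank` CP decomposition of a 3-way tensor by alternating least squares."""
    rng = np.random.default_rng(seed)
    factors = [rng.standard_normal((n, rank)) for n in X.shape]
    for _ in range(n_iter):
        for mode in range(3):
            first, second = [factors[m] for m in range(3) if m != mode]
            # Unfold X along `mode`; columns follow the order of the other two modes.
            unfolded = np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)
            gram = (first.T @ first) * (second.T @ second)
            factors[mode] = unfolded @ khatri_rao(first, second) @ np.linalg.pinv(gram)
    return factors

def fit_seasonal_var(A, season):
    """Least-squares fit of A[t] on an intercept, A[t-1], and A[t-season]."""
    Y = A[season:]
    Z = np.hstack([np.ones((len(Y), 1)), A[season - 1:-1], A[:-season]])
    coef, *_ = np.linalg.lstsq(Z, Y, rcond=None)
    return coef

def forecast_next_slice(factors, coef, season):
    """Forecast the next temporal factor row and rebuild the zone x type slice."""
    A, B, C = factors
    z = np.concatenate([[1.0], A[-1], A[-season]])
    a_next = z @ coef
    return np.einsum("r,jr,kr->jk", a_next, B, C)

def make_synthetic_tensor(n_hours, season=24):
    """Hypothetical noise-free demand tensor with two daily-periodic patterns."""
    t = np.arange(n_hours)
    A = np.column_stack([1 + np.sin(2 * np.pi * t / season),
                         1 + np.cos(4 * np.pi * t / season)])
    B = np.array([[1.0, 0.1], [0.8, 0.3], [0.5, 0.5], [0.2, 0.9], [0.1, 1.0]])
    C = np.array([[1.0, 0.2], [0.3, 1.0], [0.6, 0.6]])
    return np.einsum("tr,jr,kr->tjk", A, B, C)

# Fit on the first 96 hours and forecast the held-out 97th hour.
X = make_synthetic_tensor(97)
train, held_out = X[:-1], X[-1]
factors = cp_als(train, rank=2)
coef = fit_seasonal_var(factors[0], season=24)
forecast = forecast_next_slice(factors, coef, season=24)
```

In this sketch the autoregression is applied only to the temporal factor matrix, so the forecast inherits the spatial and taxi-type patterns found by the decomposition. Real trip records would first need to be aggregated into counts per hour, zone, and taxi type, and the rank and lag structure would have to be chosen by validation.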