A Privacy-Preserving Vehicle Trajectory Clustering Framework

Author(s):  Ran Tian, Pulun Gao, Yanxing Liu

Affiliation(s):  Department of Computer Science and Engineering, Northwest Normal University, Lanzhou 730070, China

Corresponding email(s):   tianran@nwnu.edu.cn, 202031603111@nwnu.edu.cn, lyanxing@nwnu.edu.cn

Key Words:  Privacy protection, Variational AutoEncoder, Improved K-means, Vehicle trajectory clustering

As one of the essential tools for spatio-temporal traffic data mining, vehicle trajectory clustering is widely used to mine the behavior patterns of vehicles. However, uploading raw vehicle trajectory data to a server for clustering carries a risk of privacy leakage. One of the current challenges is therefore how to perform vehicle trajectory clustering while protecting users' privacy. We propose a privacy-preserving vehicle trajectory clustering framework. In the framework, the client computes the hidden variables of a vehicle trajectory and uploads them to the server, which performs clustering analysis on the hidden variables and returns the results to the client. The specific algorithm deployed in the framework is an improved K-means based on a variational AutoEncoder (IKV). The IKV workflow is as follows. First, we train the variational AutoEncoder (VAE) on historical vehicle trajectory data; once the VAE's decoder can approximately reconstruct the original data, the encoder is deployed to the edge computing device. Second, the edge device transmits the hidden variables to the server. Finally, the server clusters the hidden variables using the improved K-means, which prevents leakage of the vehicle trajectory. IKV was compared with numerous clustering methods on three datasets. We found that 75% of IKV's clustering results are optimal or suboptimal, and 77.78% of IKV's clustering results are more stable. The proposed framework can therefore be applied to privacy-conscious production environments, such as carpooling tasks. Moreover, owing to its low sensitivity to the number of cluster centers, the proposed framework can be applied to clustering tasks of different magnitudes.
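The client/server split described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `encode_on_client` is a hypothetical fixed linear map with a tanh nonlinearity standing in for a trained VAE encoder, `kmeans_on_server` is plain Lloyd's K-means rather than the improved K-means of IKV, and the trajectories are random synthetic data. The only point it shows is that the server receives low-dimensional hidden variables and never the raw trajectories.

```python
import numpy as np


def encode_on_client(trajectory, weights, bias):
    """Hypothetical stand-in for a trained VAE encoder (not the paper's model).

    Flattens an (n_points, 2) trajectory and maps it to a latent vector.
    """
    return np.tanh(trajectory.reshape(-1) @ weights + bias)


def assign_labels(latents, centers):
    """Assign each latent vector to its nearest center (Euclidean distance)."""
    dists = np.linalg.norm(latents[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)


def kmeans_on_server(latents, k, n_iter=50, seed=0):
    """Plain Lloyd's K-means on latent vectors (assumption: not the improved variant)."""
    rng = np.random.default_rng(seed)
    # Initialize centers from k distinct latent vectors.
    centers = latents[rng.choice(len(latents), size=k, replace=False)]
    for _ in range(n_iter):
        labels = assign_labels(latents, centers)
        # Recompute each center; keep the old one if its cluster is empty.
        new_centers = np.array([
            latents[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return assign_labels(latents, centers), centers


# Synthetic setup: 30 trajectories of 20 (x, y) points, 4-dimensional latent space.
rng = np.random.default_rng(42)
n_points, latent_dim = 20, 4
weights = 0.1 * rng.normal(size=(n_points * 2, latent_dim))  # hypothetical encoder weights
bias = np.zeros(latent_dim)
trajectories = rng.normal(size=(30, n_points, 2))

# Client side: only these latent vectors would be uploaded.
latents = np.stack([encode_on_client(t, weights, bias) for t in trajectories])

# Server side: clustering sees latents only, never the raw trajectories.
labels, centers = kmeans_on_server(latents, k=3)
```

In an actual deployment the encoder weights would come from VAE training on historical trajectories, and the server would return `labels` to the clients.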

