TrafficFormer: An Efficient Pre-trained Model for Traffic Data

Authors: Guangmeng Zhou, Xiongwen Guo, Zhuotao Liu, Tong Li, Qi Li, Ke Xu

Abstract

Traffic data embeds deep domain-specific knowledge, which makes labeling difficult, and the scarcity of labeled data hurts the accuracy of learning-based traffic analysis. Pre-training is widely adopted in vision and natural language processing to cope with limited labeled data, but it remains underexplored for traffic analysis. This paper proposes TrafficFormer, an efficient pre-trained model for traffic data. In the pre-training stage, TrafficFormer introduces a fine-grained multi-classification task to strengthen its traffic representations; in the fine-tuning stage, it proposes a traffic data augmentation method that exploits fields whose values are randomly initialized, helping the model focus on key information. We evaluate TrafficFormer on both traffic classification and protocol understanding tasks. Experimental results show that TrafficFormer achieves superior performance on six traffic classification datasets, with F1-score improvements of up to 10%, and exhibits significantly stronger protocol understanding than existing traffic pre-training models.
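To give a rough sense of what such field-randomization augmentation could look like, below is a minimal Python sketch: header fields that a sender initializes randomly (and that therefore carry no task-relevant signal) are re-randomized to produce an additional training sample with the same label. The byte offsets, the chosen fields, and the function name `augment_flow` are illustrative assumptions, not the paper's actual implementation.

```python
import random

def augment_flow(packet: bytes) -> bytes:
    """Return an augmented copy of a raw IPv4+TCP packet by re-randomizing
    fields whose values the sender initializes randomly (assumed offsets;
    IHL=5, no IP options; checksums left untouched for brevity)."""
    aug = bytearray(packet)
    # IPv4 Identification field (bytes 4-5 of the IP header)
    aug[4:6] = random.randbytes(2)
    # TCP source port (bytes 0-1 of the TCP header, i.e., offset 20)
    aug[20:22] = random.randbytes(2)
    # TCP initial sequence number (bytes 4-8 of the TCP header)
    aug[24:28] = random.randbytes(4)
    return bytes(aug)
```

Because these fields are random per connection anyway, a model fine-tuned on both the original and augmented bytes is discouraged from latching onto them and is nudged toward the informative parts of the traffic.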


Current and Past Affiliations