Maneuver-based Trajectory Prediction for Self-driving Cars Using Spatio-temporal Convolutional Networks

Roughly 1.3 million people die every year in car accidents around the world, and an estimated 90% of car accidents are caused by human error. Self-driving cars would not only make the world a safer place, but also make transport and travel more predictable and more efficient. This efficiency would benefit the economy and the social sector on a global scale.

Image courtesy of Pexels, free licence

Socio-Economic Impact of Self-Driving Cars

US President John F. Kennedy is often quoted as saying, “American roads are not good because America is rich, but America is rich because American roads are good.” Mr. Kennedy was probably referring to efficient logistics as a reason for America’s prosperity. Imagine extending that logistics efficiency, and the prosperity that comes with it, across the world through self-driving cars.

Quite understandably, autonomous cars have become a central research focus for giants like Google, Uber, and Tesla. Achieving human-level performance, however, is still hard. Trajectory prediction in particular is a challenging problem, primarily because of:

  • Interdependencies between vehicle behaviors
  • The multimodal nature of future intentions driving different vehicles
  • Dynamic and complex driving environment

These challenges are discussed, and a model for predicting trajectories is proposed, in the research paper “Maneuver-based Trajectory Prediction for Self-driving Cars Using Spatio-temporal Convolutional Networks” by Benedikt Mersch, Thomas Hollen, Kun Zhao, Cyrill Stachniss, and Ribana Roscher, which forms the basis of the following text.

Image credit: arXiv:2109.07365

Understanding the problem

Goal: Predict the trajectory for vehicle T, including probable maneuvers.

Experimental Setting: As vehicle T approaches vehicle F, it can either slow down or change lanes. If the driver decides to change lanes, the trajectory of T depends on many factors, particularly:

  • Position
  • Speed
  • Acceleration
  • Intentions of drivers

Researchers’ approach

Because close neighbors influence the possible actions a vehicle can take, the researchers take the target vehicle’s direct neighborhood into account when predicting a trajectory from past data. The key idea of the proposed model is to classify the lane change maneuver of a target vehicle for each step into the future and then predict the corresponding trajectory.
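One way to picture the per-step maneuver classification is through the targets such a classifier could be trained on. The sketch below is only an illustration under stated assumptions: the function name `label_maneuvers`, the three label strings, and the lane numbering convention (lane ids grow toward the right) are hypothetical and are not taken from the paper.

```python
from typing import List

# Hypothetical label names for the three lateral maneuvers.
KEEP, LEFT, RIGHT = "keep", "left", "right"


def label_maneuvers(current_lane: int, future_lanes: List[int]) -> List[str]:
    """Label every future step relative to the lane occupied right now.

    Assumption: lane ids grow toward the right, so a smaller id than the
    current one means the vehicle has moved to the left.
    """
    labels = []
    for lane in future_lanes:
        if lane < current_lane:
            labels.append(LEFT)
        elif lane > current_lane:
            labels.append(RIGHT)
        else:
            labels.append(KEEP)
    return labels


# Vehicle T is in lane 2 now and occupies lane 1 from the fourth future step on.
labels = label_maneuvers(2, [2, 2, 2, 1, 1])
print(labels)
```

A trajectory predictor could then be conditioned on such a label sequence, which mirrors the classify-then-predict order described above.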

Training & Evaluation: The researchers used the following datasets to train and evaluate the proposed approach on real-world scenarios:

  • The highD dataset contains 110,000 vehicle trajectories from 60 highway recordings. The highways are located in Germany and have 2 to 3 lanes.
  • The NGSIM dataset contains 9,206 vehicles driving on two different US highways with 5 to 6 lanes.

Both datasets provide position, velocity, and acceleration. For each dataset, the researchers used 70% of all vehicle trajectories for training, 10% for validation, and 20% for testing.

It is also worth noting that the model learns the temporal and spatial relations among all vehicles jointly. This makes prediction fast because, in contrast to recurrent models, no previous hidden states need to be computed and taken into account.
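The 70% / 10% / 20% split of vehicle trajectories mentioned above can be sketched as follows. This is a minimal illustration, not the researchers' code: the function name `split_trajectories`, the seeded shuffle, and the use of integer trajectory ids are assumptions made for the example.

```python
import random
from typing import List, Sequence, Tuple


def split_trajectories(
    trajectory_ids: Sequence[int], seed: int = 0
) -> Tuple[List[int], List[int], List[int]]:
    """Shuffle trajectory ids and split them 70/10/20 into train/val/test."""
    shuffled = list(trajectory_ids)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    # Integer arithmetic avoids floating-point rounding surprises.
    n_train = len(shuffled) * 7 // 10
    n_val = len(shuffled) // 10
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # the remainder, roughly 20%
    return train, val, test


train, val, test = split_trajectories(range(100))
print(len(train), len(val), len(test))
```

Splitting by whole trajectory, rather than by individual time step, keeps samples from the same vehicle out of both the training and the test set.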

Implementation: The proposed model is implemented using spatio-temporal convolutional networks. The results are evaluated on the highway datasets mentioned above.
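To give a feel for what a convolution across time and neighboring vehicles does, here is a minimal pure-Python sketch. It is not the paper's network: the grid layout (rows as time steps, columns as neighbor slots), the single speed feature, and the averaging kernel are assumptions chosen for illustration. A trained network would learn its kernel weights instead.

```python
from typing import List

Grid = List[List[float]]


def conv2d_valid(grid: Grid, kernel: Grid) -> Grid:
    """Slide `kernel` over `grid` without padding.

    Like the "convolution" in deep learning frameworks, this is technically
    a cross-correlation: the kernel is not flipped.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(grid) - kh + 1
    out_w = len(grid[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0.0
            for di in range(kh):
                for dj in range(kw):
                    acc += grid[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        out.append(row)
    return out


# 4 time steps (rows) x 3 neighbor slots (columns), one speed value in m/s each.
speeds = [
    [30.0, 28.0, 31.0],
    [30.0, 27.0, 31.0],
    [29.0, 26.0, 32.0],
    [29.0, 25.0, 32.0],
]
# A 2x2 averaging kernel mixes two consecutive time steps and two adjacent slots.
kernel = [[0.25, 0.25], [0.25, 0.25]]
features = conv2d_valid(speeds, kernel)
print(features)
```

Each output value depends only on a local window of the input, so all outputs can be computed independently, with no hidden state carried from one time step to the next.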

Results: The researchers report that the model achieves prediction performance that is competitive with state-of-the-art methods on these datasets.

Conclusion: The research paper presents a novel approach to classify the future lane change maneuver of a target vehicle and predict the corresponding trajectory. In the words of the researchers,

The main contributions of this paper towards vehicle motion prediction are two-fold. First, we present a novel semantic neighborhood representation of the scene around a target vehicle for joint aggregation of higher-level features in prediction tasks. The memory-efficient and dense 3D tensor encodes the time, neighbor positions, and past vehicle states as the dynamic context. Second, we propose the use of two 2D convolutional neural networks (CNNs) for joint spatio-temporal feature extraction from the proposed input representation. Our approach explicitly uses convolutions across time and the space of neighboring vehicles. To this end, we classify the future lane change intention of a target vehicle with respect to a lateral motion and then predict a trajectory based on the classified lane change intention. This yields an approach that is able to (i) successfully classify lane change maneuvers and predict a corresponding trajectory for real-world scenarios by (ii) performing joint spatio-temporal feature aggregation with 2D CNNs, (iii) and outperform state-of-the-art methods. These three main claims are backed up by the paper, our experimental evaluation, and an ablation study.
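The dense 3D tensor described in the quote, with axes for time, neighbor position, and vehicle state, can be pictured with a small sketch. Everything concrete here is an assumption for illustration: the slot numbering (0 for the target vehicle, 1 for the vehicle ahead, 2 for a vehicle in an adjacent lane), the three state features, and the zero-filling of empty slots are hypothetical and may differ from the paper's actual representation.

```python
from typing import Dict, List, Tuple

# Hypothetical per-vehicle state: (x position in m, y position in m, speed in m/s).
State = Tuple[float, float, float]
NUM_FEATURES = 3


def build_neighborhood_tensor(
    history: List[Dict[int, State]], num_slots: int
) -> List[List[List[float]]]:
    """Return a dense [time][slot][feature] tensor as nested lists.

    `history` holds one dict per past time step, mapping an occupied
    neighbor slot to that vehicle's state. Empty slots are zero-filled
    (an assumption) so that the tensor stays dense.
    """
    tensor = []
    for frame in history:
        rows = []
        for slot in range(num_slots):
            state = frame.get(slot)
            rows.append(list(state) if state is not None else [0.0] * NUM_FEATURES)
        tensor.append(rows)
    return tensor


# Two past time steps; the adjacent-lane vehicle (slot 2) appears only in the second.
history = [
    {0: (0.0, 0.0, 30.0), 1: (25.0, 0.0, 27.0)},
    {0: (3.0, 0.0, 30.0), 1: (27.7, 0.0, 27.0), 2: (-10.0, 3.5, 33.0)},
]
tensor = build_neighborhood_tensor(history, num_slots=3)
print(len(tensor), len(tensor[0]), len(tensor[0][0]))
```

Because every time step has the same number of slots and features, a fixed-size tensor like this can be fed directly to convolutional layers.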

Source: Benedikt Mersch, Thomas Hollen, Kun Zhao, Cyrill Stachniss, Ribana Roscher, “Maneuver-based Trajectory Prediction for Self-driving Cars Using Spatio-temporal Convolutional Networks”, arXiv:2109.07365.

