Integrating vehicle dimension features for vision-based traffic density prediction using YOLOv5-LSTM architecture

Filda Angellia; Nita Merlina; Agus Subekti; Rahmadya Trias Handayanto

doi:10.52465/joscex.v7i2.119

Authors

Filda Angellia, Faculty of Information Technology, Universitas Nusa Mandiri, Indonesia
Nita Merlina, Faculty of Information Technology, Universitas Nusa Mandiri, Indonesia
Agus Subekti, Faculty of Information Technology, Universitas Nusa Mandiri, Indonesia
Rahmadya Trias Handayanto, Faculty of Information Technology, Universitas Nusa Mandiri, Indonesia

Keywords:

Traffic density prediction, Vehicle dimension features, YOLOv5, LSTM

Abstract

Traffic congestion in urban areas calls for intelligent, technology-based solutions that support modern transportation systems. This study proposes a vision-based traffic congestion prediction framework that integrates YOLOv5 with a sequential deep learning model to improve forecasting accuracy. YOLOv5 performs real-time vehicle detection, and the width and height of each bounding box are extracted as spatial occupancy features that provide information beyond conventional vehicle counting. Experiments use six urban traffic videos comprising 90,012 frames collected under various traffic conditions. The extracted features are converted into sequential temporal records and used to train Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) models. Model performance is evaluated using Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and Mean Absolute Error (MAE). Experimental results show that both models achieve competitive performance for traffic congestion forecasting. LSTM achieves the best performance, with an MSE of 3.77, an RMSE of 1.94, and an MAE of 1.47, which indicates a stronger ability to capture long-term temporal dependencies in large-scale sequential traffic data. GRU, in contrast, has lower computational complexity and faster inference owing to its simpler architecture. These findings suggest that integrating vehicle dimension features with sequential deep learning models provides a more effective approach to vision-based traffic density prediction.
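
The data flow described in the abstract (per-frame detections, bounding-box width and height features, sequential records, error metrics) can be illustrated with a minimal sketch. This is not the authors' implementation. The box layout, the function names, the area-ratio occupancy feature, the window length, and the choice of the next frame's vehicle count as the prediction target are all assumptions made for illustration, since the abstract does not specify them. The detector and the LSTM/GRU models are left out; the sketch only covers the feature and evaluation steps around them.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical detection layout: (x_center, y_center, width, height) in pixels.
Box = Tuple[float, float, float, float]


def frame_features(boxes: Sequence[Box], frame_w: float, frame_h: float) -> List[float]:
    """Summarize one frame as [vehicle count, mean width, mean height, occupancy]."""
    n = len(boxes)
    if n == 0:
        return [0.0, 0.0, 0.0, 0.0]
    mean_w = sum(b[2] for b in boxes) / n
    mean_h = sum(b[3] for b in boxes) / n
    # Assumed occupancy measure: summed box area over frame area.
    # Overlapping boxes are counted more than once.
    occupancy = sum(b[2] * b[3] for b in boxes) / (frame_w * frame_h)
    return [float(n), mean_w, mean_h, occupancy]


def make_windows(series: Sequence[List[float]], window: int):
    """Build (input window, target) pairs for a sequence model.

    The target here is the vehicle count of the frame that follows the
    window; this is an assumed target, not one stated in the abstract.
    """
    pairs = []
    for i in range(len(series) - window):
        pairs.append((list(series[i:i + window]), series[i + window][0]))
    return pairs


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean Squared Error."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root Mean Squared Error, the square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean Absolute Error."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
```

Because RMSE is the square root of MSE, the reported LSTM figures are mutually consistent: the square root of 3.77 is approximately 1.94.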