Integrating vehicle dimension features for vision-based traffic density prediction using YOLOv5-LSTM architecture
DOI:
https://doi.org/10.52465/joscex.v7i2.119
Keywords:
Traffic density prediction, Vehicle dimension features, YOLOv5, LSTM
Abstract
Traffic congestion in urban areas requires intelligent technology-based solutions to support modern transportation systems. This study proposes a vision-based traffic congestion prediction framework that integrates YOLOv5 with a sequential deep learning model to improve forecasting accuracy. YOLOv5 is used for real-time vehicle detection, while the width and height of each bounding box are extracted as spatial occupancy features to provide information beyond conventional vehicle counting. Experiments are conducted on six urban traffic videos comprising 90,012 frames collected under various traffic conditions. The extracted features are converted into sequential temporal records and then used to train Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) models. Model performance is evaluated using Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and Mean Absolute Error (MAE). Experimental results show that both models achieve competitive performance for traffic congestion forecasting. LSTM achieves the best performance with an MSE of 3.77, an RMSE of 1.94, and an MAE of 1.47, demonstrating its superior ability to capture long-term temporal dependencies in large-scale sequential traffic data. In contrast, GRU exhibits lower computational complexity and faster inference time due to its simpler architecture. These findings suggest that integrating vehicle dimensional features with sequential deep learning models provides a more effective approach to AI-based traffic congestion prediction.
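The feature pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the box format `(x, y, w, h)`, the choice of per-frame aggregates, the window length, and the use of vehicle count as the forecasting target are all assumptions made for illustration, and the function names are hypothetical. The detector and the recurrent model are left out so that the example runs with the standard library only.

```python
import math

def frame_features(boxes):
    """Aggregate one frame's detections into a fixed-length record.

    `boxes` is a list of (x, y, w, h) tuples in pixels (assumed format).
    Returns (vehicle_count, mean_width, mean_height, total_box_area).
    """
    n = len(boxes)
    if n == 0:
        return (0, 0.0, 0.0, 0.0)
    widths = [b[2] for b in boxes]
    heights = [b[3] for b in boxes]
    total_area = float(sum(w * h for w, h in zip(widths, heights)))
    return (n, sum(widths) / n, sum(heights) / n, total_area)

def make_windows(records, target_index, window):
    """Slice per-frame records into (input sequence, next-step target) pairs.

    Each input is `window` consecutive records; the target is one field
    of the record that immediately follows the window.
    """
    inputs, targets = [], []
    for start in range(len(records) - window):
        inputs.append(records[start:start + window])
        targets.append(records[start + window][target_index])
    return inputs, targets

def regression_metrics(y_true, y_pred):
    """Compute MSE, RMSE, and MAE for two equal-length sequences."""
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / len(errors)
    mae = sum(abs(e) for e in errors) / len(errors)
    return {"mse": mse, "rmse": math.sqrt(mse), "mae": mae}

# Toy detections for four frames (made-up values, not data from the study).
frames = [
    [(0, 0, 10, 20), (5, 5, 30, 40)],
    [],
    [(1, 1, 20, 10)],
    [(0, 0, 10, 10), (0, 0, 10, 10), (0, 0, 10, 10)],
]
records = [frame_features(f) for f in frames]
inputs, targets = make_windows(records, target_index=0, window=2)
```

In a full pipeline, `inputs` and `targets` would be passed to an LSTM or GRU, and `regression_metrics` would score its predictions. Because RMSE is the square root of MSE, the reported RMSE of 1.94 is consistent with the reported MSE of 3.77.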
License
Copyright (c) 2026 Journal of Soft Computing Exploration

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
