YOLOv8 Object Detection for Autonomous Driving

2D Object Detection for Autonomous Driving

Project Overview

Objective: The goal of this project is to develop a robust object detection system for autonomous driving using the YOLOv8 architecture. The system should accurately identify and locate objects within an image, facilitating safe and efficient autonomous vehicle operation.

Problem Description: Object detection is crucial for autonomous driving, enabling the vehicle to perceive its surroundings and make informed decisions. This project leverages the KITTI dataset, a benchmark in autonomous driving research, to train and evaluate the YOLOv8 model for 2D object detection tasks.

Methodology

Dataset

KITTI Dataset Description

The KITTI dataset, developed by the Karlsruhe Institute of Technology and the Toyota Technological Institute at Chicago, provides a comprehensive suite of data collected from various sensor modalities. It is widely used for research in computer vision and autonomous driving.

Figure 4: Sample output of the KITTI Dataset

Data Collection

Data Preprocessing and Augmentation
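Before training, KITTI's label files must be converted to the YOLO annotation format (one line per object: class index followed by a normalized box center and size). A minimal conversion sketch is shown below; the class map and the nominal KITTI image size of 1242x375 are assumptions for illustration, as the exact classes and resolutions used in this project are not specified above.

```python
def kitti_to_yolo(line, img_w=1242, img_h=375, class_map=None):
    """Convert one KITTI label line to a YOLO label line
    ("class cx cy w h", all box values normalized to [0, 1]).
    Assumed class map; adjust to the classes actually trained on."""
    if class_map is None:
        class_map = {"Car": 0, "Pedestrian": 1, "Cyclist": 2}
    parts = line.split()
    cls = parts[0]
    if cls not in class_map:
        return None  # skip DontCare and unmapped classes
    # KITTI stores 2D boxes as pixel coordinates: left, top, right, bottom
    left, top, right, bottom = map(float, parts[4:8])
    cx = (left + right) / 2 / img_w
    cy = (top + bottom) / 2 / img_h
    w = (right - left) / img_w
    h = (bottom - top) / img_h
    return f"{class_map[cls]} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"
```

Applying this per line of each KITTI `label_2` file yields the `.txt` files YOLOv8 expects alongside the images.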

Model Architecture

YOLOv8 Architecture

Backbone

The backbone is the convolutional neural network (CNN) responsible for extracting features from the input image. YOLOv8 uses a custom CSPDarknet53 backbone, which employs cross-stage partial connections to improve information flow between layers and boost accuracy.
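The core idea of a cross-stage partial connection is to split the feature channels, pass only one half through the convolutional stack, and re-concatenate, so gradients have a short path around the transformed branch. The toy sketch below illustrates just that channel-split idea with NumPy; it is not YOLOv8's actual implementation, and `transform` stands in for a stack of conv layers.

```python
import numpy as np

def csp_block(x, transform):
    """Conceptual cross-stage partial connection.
    x: feature tensor of shape (C, H, W); transform: stand-in for conv layers.
    Half the channels bypass the transform untouched."""
    half = x.shape[0] // 2
    part1, part2 = x[:half], x[half:]          # partial split along channels
    return np.concatenate([part1, transform(part2)], axis=0)

x = np.random.rand(64, 32, 32)
y = csp_block(x, lambda t: t * 2.0)            # toy transform for illustration
```

The untransformed half preserves earlier-stage information directly, which is the "improved information flow" the text refers to.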

Neck

The neck merges feature maps from different stages of the backbone to capture information at multiple scales. YOLOv8 builds its neck from C2f modules arranged in a path-aggregation (PAN-FPN) structure, combining high-level semantic features with low-level spatial information.
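The basic fusion step in such a neck is: upsample the semantically rich, low-resolution map and concatenate it with the higher-resolution map along the channel axis. A minimal NumPy sketch of one such step (nearest-neighbor upsampling, illustrative shapes):

```python
import numpy as np

def fuse_levels(high, low):
    """One top-down fusion step: 2x nearest-neighbor upsample of the
    low-resolution map, then channel-wise concatenation. Shapes are (C, H, W)."""
    up = low.repeat(2, axis=1).repeat(2, axis=2)   # (C, H, W) -> (C, 2H, 2W)
    return np.concatenate([high, up], axis=0)

p4 = np.random.rand(128, 40, 40)   # higher-resolution, lower-level features
p5 = np.random.rand(256, 20, 20)   # lower-resolution, higher-level features
fused = fuse_levels(p4, p5)        # shape (384, 40, 40)
```

In the real network the concatenated result is then refined by a C2f block before being passed to the head.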

Head

The head is responsible for making the final predictions. YOLOv8 uses a decoupled, anchor-free head: for each location in the multi-scale feature maps, separate branches predict bounding-box coordinates and class probabilities.
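To make the grid-cell prediction concrete, the sketch below decodes raw outputs for one cell into an image-space box in the classic YOLO-family style (sigmoid center offsets relative to the cell, scaled by the stride). Note this is a simplified illustration: YOLOv8's actual head is anchor-free and regresses box sides via distribution focal loss, so the real decoding differs.

```python
import math

def decode_cell(tx, ty, tw, th, cx, cy, stride):
    """Decode one grid cell's raw outputs (tx, ty, tw, th) at grid position
    (cx, cy) into an image-space box center and size. Simplified YOLO-style
    decoding for illustration only."""
    sig = lambda v: 1.0 / (1.0 + math.exp(-v))
    bx = (sig(tx) + cx) * stride   # box center x in pixels
    by = (sig(ty) + cy) * stride   # box center y in pixels
    bw = math.exp(tw) * stride     # box width (no anchor prior in this sketch)
    bh = math.exp(th) * stride     # box height
    return bx, by, bw, bh
```

With zero raw outputs, the decoded box sits at the center of its grid cell with a size of one stride, which is the head's "default" prediction before training shifts it.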

Figure 6: YOLOv8 Architecture

Training Process

The KITTI dataset (roughly 22 GB) was used, with 80% of the data allocated for training and 20% for validation. The following hyperparameters were configured for training the YOLOv8 model:
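The 80/20 split described above can be sketched as follows. The seed, frame count, and the commented training call (via the Ultralytics `YOLO` API) are illustrative assumptions, since the exact hyperparameter values are not listed here.

```python
import random

def split_dataset(image_ids, train_frac=0.8, seed=0):
    """Shuffle KITTI frame IDs deterministically and split train/val 80/20."""
    ids = sorted(image_ids)
    random.Random(seed).shuffle(ids)
    cut = int(len(ids) * train_frac)
    return ids[:cut], ids[cut:]

# KITTI's 2D detection benchmark provides 7,481 labeled training frames.
train, val = split_dataset([f"{i:06d}" for i in range(7481)])

# Training would then use the Ultralytics API, roughly (values illustrative):
# from ultralytics import YOLO
# model = YOLO("yolov8n.pt")
# model.train(data="kitti.yaml", epochs=100, imgsz=640, batch=16)
```

Writing the two ID lists into the `train`/`val` entries of `kitti.yaml` makes the split reproducible across runs.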

Evaluation and Results

Evaluation Metrics
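Detection metrics such as precision, recall, and mAP all rest on Intersection over Union (IoU): a prediction counts as correct only if its IoU with a ground-truth box exceeds a threshold (commonly 0.5). A minimal IoU sketch for axis-aligned boxes:

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    x1 = max(box_a[0], box_b[0]); y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2]); y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)   # overlap area (0 if disjoint)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)
```

Sweeping a confidence threshold over IoU-matched predictions produces the precision-recall curve whose area is the AP for each class; mAP averages AP across classes.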

Quantitative Results
Table 1: Evaluation Metrics for YOLOv8 on KITTI Dataset

Project Media

Detection of cars.
Detection of people at my university.
Detection of different object classes, including bikes.
Detection of pedestrians.
Detection of objects on the road, such as an oncoming bus.
Real-time detection output shown in the terminal during processing.
The sensor suite mounted on the car and used for object detection during autonomous driving.

Discussion

Challenges and Limitations

Future Work

Conclusion

In this project, I explored the application of YOLOv8 for 2D object detection in autonomous driving using the KITTI dataset. By preprocessing LiDAR point clouds into image representations, I leveraged the strengths of YOLOv8 to detect and classify objects in a 2D context. My experiments demonstrated that the CSPDarknet53 backbone achieves the highest detection accuracy, making it suitable for applications where precision is paramount. Conversely, MobileNetV2 offers the best real-time performance, ideal for scenarios requiring immediate processing.
