In this case study, we're diving into how to train a YOLOv8-seg model to get better at recognizing specific kinds of objects. We're focusing on corals here, but the cool thing is that what we learn can be applied to just about any dataset you have in mind. This makes our approach useful for anyone looking to dial in their model's accuracy for particular objects.
Before walking through the step-by-step approach and all the parameter details, let's dive deeper into segmentation and YOLOv8.
Semantic segmentation is concerned with classifying each pixel of an image into predefined categories, effectively segmenting the image into regions that correspond to different classes. However, it treats all instances of a particular class as a single entity. For example, in an image containing several cars, semantic segmentation labels all cars collectively, without distinguishing between individual vehicles.
Instance segmentation is a Computer Vision task that involves identifying and delineating individual objects within an image. Unlike semantic segmentation, which classifies each pixel into pre-defined categories, instance segmentation aims to differentiate and separate instances of objects from one another.
In instance segmentation, the goal is to not only classify each pixel but also assign a unique label or identifier to each distinct object instance. This means that objects of the same class are treated as separate entities. For example, if there are multiple instances of cars in an image, instance segmentation algorithms will assign a unique label to each car, allowing for precise identification and differentiation.
Instance segmentation provides more detailed and granular information about object boundaries and spatial extent compared to other segmentation techniques. It is widely used in various applications, including autonomous driving, robotics, object detection, medical imaging, and video analysis.
Many modern instance segmentation algorithms, like YOLOv8-seg, employ deep learning techniques, particularly convolutional neural networks (CNNs), to perform pixel-wise classification and object localization simultaneously. These algorithms often combine the strengths of object detection and semantic segmentation to achieve accurate instance-level segmentation results.
YOLOv8, developed by Ultralytics [1], is a model that specializes in object detection, image classification, and instance segmentation tasks. It is known for its accuracy and compact model size, making it a notable addition to the YOLO series, which has seen success with YOLOv5. With its improved architecture and user-friendly enhancements, YOLOv8 offers a great option for Computer Vision projects.
While an official research paper for YOLOv8 is currently unavailable, an analysis of the repository and community discussions [2] provides insights into its architecture. YOLOv8 introduces anchor-free detection, which predicts object centers instead of relying on anchor boxes. This approach simplifies the model and improves post-processing steps like Non-Maximum Suppression.
The architecture also incorporates new convolutions and module configurations, leaning towards a ResNet-like structure. For a detailed visualization of the network's architecture, refer to the image created by GitHub user RangeKing.
The training routine of YOLOv8 incorporates mosaic augmentation, where multiple images are combined to expose the model to variations in object locations, occlusion, and surrounding pixels. However, this augmentation is turned off during the final training epochs to prevent performance degradation.
The accuracy improvements of YOLOv8 have been validated on the widely used COCO benchmark, where the model achieves impressive mean Average Precision (mAP) scores. For instance, the YOLOv8m-seg model reaches 49.9% box mAP on COCO. Ultralytics publishes a summary of model sizes, mAP scores, and other performance metrics for the different YOLOv8-seg variants in its repository [1].
Example outputs from the YOLOv8x detection and instance segmentation models illustrate the difference between bounding boxes and pixel-level masks in practice.
The development of YOLOv8-seg represents a significant advancement in tackling the challenges posed by instance segmentation. This iteration of the YOLO series enhances the model's capability to accurately identify and delineate individual objects within complex images. Such an advancement is crucial for applications where high precision in object detection and classification is paramount.
Consider the application in autonomous driving, where the accurate and real-time differentiation between various entities like vehicles, pedestrians, and obstacles is critical for the safety and operational efficiency of autonomous systems. YOLOv8-seg's adeptness at instance segmentation enables a nuanced understanding of the vehicle's immediate environment, facilitating improved decision-making and collision avoidance mechanisms.
Similarly, in the field of medical imaging, the ability to perform instance segmentation allows for the granular analysis of medical scans. YOLOv8-seg can be employed to accurately distinguish between different types of tissues or to identify and measure pathological features with precision. This capability is invaluable for diagnosing conditions, planning treatments, and tracking disease progression with a higher degree of accuracy than previously possible.
By enhancing instance segmentation capabilities, YOLOv8-seg equips developers and researchers with a potent tool that significantly advances the performance of computer vision applications. Its innovations not only refine the process of object detection and classification but also pave the way for new developments across various industries that rely on sophisticated and reliable image analysis techniques.
The Ikomia API lets you train and run inference with YOLOv8-seg with minimal coding.
The Ikomia API simplifies the development of Computer Vision workflows and allows for easy experimentation with different parameters to achieve the best results.
With the Ikomia API, we can train a custom YOLOv8 instance segmentation model with just a few lines of code. To get started, you need to install the API in a virtual environment.
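A minimal setup could look like this; the ikomia package name is the one published on PyPI:

```bash
python -m venv venv

# Activate it (Linux/macOS); on Windows use venv\Scripts\activate
source venv/bin/activate

pip install ikomia
```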
In this tutorial, we will use the coral dataset from Roboflow [3].
Note: The dataset originally used in this article is no longer accessible and has been replaced with an alternative, which explains the differences between the images shown here and the current dataset.
A coral fine-tuned instance segmentation model could be used in marine biology and environmental conservation, focusing on the detailed analysis and monitoring of coral reefs. Instance segmentation goes beyond just detecting objects (in this case, corals) within an image; it also precisely delineates the boundaries of each coral instance. This fine-grained approach enables a range of reef analysis and monitoring applications.
You can also directly load the open-source notebook we have prepared.
The training process for 50 epochs completed in approximately 1 hour on an NVIDIA GeForce RTX 3060 Laptop GPU with 6 GB of VRAM.
With the coral dataset you downloaded, you can train a custom YOLOv8-seg model using the Ikomia API.
We initialize a workflow instance. The “wf” object can then be used to add tasks to the workflow instance, configure their parameters, and run them on input data.
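A minimal sketch of this initialization:

```python
from ikomia.dataprocess.workflow import Workflow

# Create an empty workflow to which we will add the dataset and training tasks
wf = Workflow()
```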
The downloaded COCO-format dataset contains two kinds of files: images and .json annotation files. Images are split into train, val, and test folders, each with an associated .json file containing the image annotations.
We will use the dataset_coco module provided by Ikomia API to load the custom data and annotations.
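Continuing the sketch, we add the dataset_coco task and point it at the training split; the paths and the Roboflow-style annotation file name below are placeholders for your own download location:

```python
# Load images and COCO-format annotations (replace the paths with your own)
dataset = wf.add_task(name="dataset_coco")
dataset.set_parameters({
    "json_file": "path/to/dataset/train/_annotations.coco.json",
    "image_folder": "path/to/dataset/train",
    "task": "instance_segmentation",
})
```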
We add the ‘train_yolo_v8_seg’ task to our workflow to train our custom YOLOv8-seg model. To customize the training, we configure its main parameters, such as the model variant, number of epochs, batch size, input image size, and dataset split ratio; further options, like the output folder, can also be adjusted, as sketched below.
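As a sketch, assuming the parameter names listed for this task on Ikomia HUB, the configuration could look like this; the values (and the yolov8m-seg variant) are illustrative choices, not requirements:

```python
# Add the trainer; auto_connect chains it to the dataset loader's output
train = wf.add_task(name="train_yolo_v8_seg", auto_connect=True)
train.set_parameters({
    "model_name": "yolov8m-seg",        # pre-trained variant to fine-tune
    "epochs": "50",                     # number of training epochs
    "batch_size": "4",                  # reduce if you run out of GPU memory
    "input_size": "640",                # training image size
    "dataset_split_ratio": "0.8",       # train/eval split of the loaded data
    "output_folder": "path/to/output",  # where checkpoints will be saved
})
```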
Finally, we run the workflow to start the training process.
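With auto_connect in place, a single call launches the whole chain:

```python
# Runs dataset loading followed by training
wf.run()
```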
You can monitor the progress of your training using tools like TensorBoard or MLflow.
Once the training is complete, the train_yolo_v8_seg task saves the best model in a folder named with a timestamp inside the output_folder. You can find your best.pt model in the weights subfolder of that timestamped folder, i.e. output_folder/[timestamp]/weights/best.pt.
First, we can run a coral image through the pre-trained YOLOv8-seg model:
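A sketch of this test; the image path is a placeholder for your own test image, and the display helper is the one shipped with the API:

```python
from ikomia.dataprocess.workflow import Workflow
from ikomia.utils.displayIO import display

wf = Workflow()

# Default infer_yolo_v8_seg task, pre-trained on COCO
yolo = wf.add_task(name="infer_yolo_v8_seg", auto_connect=True)

# Run on a coral image (placeholder path) and display the predicted masks
wf.run_on(path="path/to/coral_image.jpg")
display(yolo.get_image_with_mask_and_graphics())
```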
We can observe that the default pre-trained infer_yolo_v8_seg model mistakes a coral for a bear. This is because the model was trained on the COCO dataset, which does not contain any coral objects.
To test the model we just trained, we specify the path to our custom model using the ‘model_weight_file’ parameter. We then run the workflow on the same image we used previously:
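Reusing the sketch above, only the weights path changes; the timestamped folder below is a placeholder for the one created by your training run:

```python
# Same inference task, now pointing at our fine-tuned weights
yolo = wf.add_task(name="infer_yolo_v8_seg", auto_connect=True)
yolo.set_parameters({
    "model_weight_file": "path/to/output_folder/[timestamp]/weights/best.pt",
})

wf.run_on(path="path/to/coral_image.jpg")
display(yolo.get_image_with_mask_and_graphics())
```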
Comparing our results to the ground truth, we successfully identified the species Orbicella spp., though we did observe some false negatives. Training for additional epochs and augmenting the dataset with more images could further improve the performance of our custom model.
Another example showcasing effective detection results is demonstrated with the Agaricia agaricites species:
In this article, we looked into the mechanics and advantages of YOLOv8, showcasing its simplicity, versatility, and scalability. We have also seen how you can fine-tune any of the instance segmentation algorithms implemented in Ikomia with just a few lines of code.
To learn more about the API, you can refer to the documentation. Additionally, you can explore the list of state-of-the-art algorithms on Ikomia HUB and try out Ikomia STUDIO, which provides a user-friendly interface with the same features as the API.
YOLOv8 instance segmentation is a Computer Vision technique that identifies and delineates individual objects within an image. Unlike semantic segmentation, which classifies each pixel into predefined categories, instance segmentation distinguishes between different instances of objects, allowing for precise identification and separation. Training YOLOv8 for instance segmentation enhances this capability, making it suitable for detailed object recognition.
YOLOv8, developed by Ultralytics, specializes in object detection, image classification, and instance segmentation tasks. It is known for its accuracy and compact model size. Unlike previous versions, YOLOv8 introduces anchor-free detection and new convolution configurations, improving its performance and simplifying post-processing steps like Non-Maximum Suppression. Training YOLOv8 involves fine-tuning these features to enhance its capabilities.
You can easily train YOLOv8 instance segmentation using the Ikomia API. The basic steps are: install the API in a virtual environment, load your dataset with the dataset_coco module, add and configure the train_yolo_v8_seg task, and run the workflow.
After training YOLOv8, you can test your model by running it on a sample image and comparing the results to the ground truth. Specify the path to your custom model and use the ‘infer_yolo_v8_seg’ task to perform instance segmentation on new images.
The time to train YOLOv8 varies with hardware, dataset size, number of epochs, and pre-trained model size. For example, training YOLOv8m for 50 epochs on an NVIDIA GeForce RTX 3060 Laptop GPU (6 GB) took approximately 1 hour.
[1] https://github.com/ultralytics/ultralytics
[2] https://github.com/ultralytics/ultralytics/issues/189
[3] Coral Dataset