How to train YOLOv8 instance segmentation on a custom dataset

Allan Kouidri
-
7/18/2023
How to train YOLOv8 instance segmentation on a custom dataset using the Ikomia API

In this case study, we dive into how to train a YOLOv8-seg model to recognize specific kinds of objects. We focus on corals here, but everything we cover applies to just about any dataset you have in mind, making this approach useful for anyone looking to dial in the accuracy of their model on particular objects.

What is YOLOv8 instance segmentation?

Before going through the step-by-step approach with all the parameter details, let's dive deeper into segmentation and YOLOv8.

Semantic Segmentation

Semantic segmentation is concerned with classifying each pixel of an image into predefined categories, effectively segmenting the image into regions that correspond to different classes. However, it treats all instances of a particular class as a single entity. For example, in an image containing several cars, semantic segmentation labels all cars collectively, without distinguishing between individual vehicles.

Instance Segmentation

Instance segmentation is a Computer Vision task that involves identifying and delineating individual objects within an image. Unlike semantic segmentation, which classifies each pixel into pre-defined categories, instance segmentation aims to differentiate and separate instances of objects from one another.

In instance segmentation, the goal is to not only classify each pixel but also assign a unique label or identifier to each distinct object instance. This means that objects of the same class are treated as separate entities. For example, if there are multiple instances of cars in an image, instance segmentation algorithms will assign a unique label to each car, allowing for precise identification and differentiation.
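
To make the distinction concrete, here is a toy sketch (purely illustrative) of the two kinds of output for an image containing two cars:


import numpy as np

# Toy 4x4 label maps for an image containing two cars (class id 1).

# Semantic segmentation: one class-id map; both cars get the same id,
# so the two vehicles cannot be told apart.
semantic_mask = np.array([[1, 1, 0, 0],
                          [1, 1, 0, 0],
                          [0, 0, 1, 1],
                          [0, 0, 1, 1]])

# Instance segmentation: each object gets its own instance id,
# which maps back to a class label.
instance_mask = np.array([[1, 1, 0, 0],
                          [1, 1, 0, 0],
                          [0, 0, 2, 2],
                          [0, 0, 2, 2]])
instance_to_class = {1: "car", 2: "car"}  # two distinct car instances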

Instance segmentation vs semantic segmentation

Instance segmentation provides more detailed and granular information about object boundaries and spatial extent compared to other segmentation techniques. It is widely used in various applications, including autonomous driving, robotics, object detection, medical imaging, and video analysis.

Many modern instance segmentation algorithms, like YOLOv8 seg, employ deep learning techniques, particularly convolutional neural networks (CNNs), to perform pixel-wise classification and object localization simultaneously. These algorithms often combine the strengths of object detection and semantic segmentation to achieve accurate instance-level segmentation results.
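
As an illustration of that combination, YOLACT-style heads (which YOLOv8-seg's segmentation head resembles) predict a small set of shared "prototype" masks per image plus one coefficient vector per detection, then assemble each instance mask as a weighted sum. A toy sketch, not YOLOv8's actual code:


import numpy as np

# The network predicts k prototype masks once per image, and one
# k-dimensional coefficient vector per detected instance.
k, h, w = 4, 8, 8
prototypes = np.random.rand(k, h, w)   # shared across all detections
coeffs = np.random.randn(k)            # specific to one detection

# Instance mask = sigmoid of the linear combination, then thresholded
logits = np.tensordot(coeffs, prototypes, axes=1)  # shape (h, w)
mask = 1.0 / (1.0 + np.exp(-logits)) > 0.5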

What is YOLOv8?

Release and benefits

YOLOv8, developed by Ultralytics [1], is a model that specializes in object detection, image classification, and instance segmentation tasks. It is known for its accuracy and compact model size, making it a notable addition to the YOLO series, which has seen success with YOLOv5. With its improved architecture and user-friendly enhancements, YOLOv8 offers a great option for Computer Vision projects.

Comparison with other real-time object detectors: YOLOv8 achieves state-of-the-art (SOTA) performance. [1]

Architecture and innovations

While an official research paper for YOLOv8 is currently unavailable, an analysis of the repository and the available information provides insights into its architecture. YOLOv8 introduces anchor-free detection, which predicts object centers directly instead of relying on anchor boxes. This approach simplifies the model and speeds up post-processing steps like Non-Maximum Suppression (NMS).
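
To see what that post-processing step does, here is a minimal, framework-agnostic NMS sketch in NumPy (illustrative only, not YOLOv8's actual implementation):


import numpy as np

def nms(boxes: np.ndarray, scores: np.ndarray, iou_thres: float = 0.7) -> list:
    """Keep the highest-scoring box, drop boxes that overlap it above
    iou_thres, and repeat until no candidates remain."""
    x1, y1, x2, y2 = boxes.T                  # boxes: (N, 4) as x1, y1, x2, y2
    areas = (x2 - x1) * (y2 - y1)
    order = scores.argsort()[::-1]            # indices by descending score
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        # Intersection of the best box with all remaining boxes
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        order = order[1:][iou <= iou_thres]   # discard heavy overlaps
    return keep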

The architecture also incorporates new convolutions and module configurations, leaning towards a ResNet-like structure. For a detailed visualization of the network's architecture, refer to the image created by GitHub user RangeKing.

YOLOv8 model structure (non-official) [2]

Training routine and augmentation

The training routine of YOLOv8 incorporates mosaic augmentation, where multiple images are combined to expose the model to variations in object locations, occlusion, and surrounding pixels. However, this augmentation is turned off during the final training epochs to prevent performance degradation.
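
For intuition, here is a naive sketch of how four images can be tiled into a single mosaic sample (assuming OpenCV and NumPy; not Ultralytics' actual implementation, which also picks a random center point and remaps boxes and masks):


import cv2
import numpy as np

def mosaic(images: list, size: int = 640) -> np.ndarray:
    """Resize four images and tile them into one square canvas."""
    half = size // 2
    canvas = np.zeros((size, size, 3), dtype=np.uint8)
    corners = [(0, 0), (0, half), (half, 0), (half, half)]
    for img, (y, x) in zip(images, corners):
        canvas[y:y + half, x:x + half] = cv2.resize(img, (half, half))
    return canvas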

Accuracy and performance

The accuracy improvements of YOLOv8 have been validated on the widely used COCO benchmark, where the model achieves impressive mean Average Precision (mAP) scores (mAP 50-95 averages the AP over IoU thresholds from 0.50 to 0.95 in steps of 0.05). For instance, the YOLOv8m-seg model achieves a remarkable 49.9% box mAP on COCO. The following table summarizes the model sizes, mAP scores, and other performance metrics for the different variants of YOLOv8-seg:

Model        size (pixels)  mAPbox 50-95  mAPmask 50-95  Speed A100 TensorRT (ms)  params (M)  FLOPs (B)
YOLOv8n-seg  640            36.7          30.5           1.21                      3.4         12.6
YOLOv8s-seg  640            44.6          36.8           1.47                      11.8        42.6
YOLOv8m-seg  640            49.9          40.8           2.18                      27.3        110.2
YOLOv8l-seg  640            52.3          42.6           2.79                      46.0        220.5
YOLOv8x-seg  640            53.4          43.4           4.02                      71.8        344.1

Here is an example of outputs using YOLOv8x detection and instance segmentation models:

YOLOv8 instance segmentation group of people

YOLOv8-seg's Contribution to Instance Segmentation

The development of YOLOv8-seg represents a significant advancement in tackling the challenges posed by instance segmentation. This iteration of the YOLO series enhances the model's capability to accurately identify and delineate individual objects within complex images. Such an advancement is crucial for applications where high precision in object detection and classification is paramount.

In Autonomous Driving

Consider the application in autonomous driving, where the accurate and real-time differentiation between various entities like vehicles, pedestrians, and obstacles is critical for the safety and operational efficiency of autonomous systems. YOLOv8-seg's adeptness at instance segmentation enables a nuanced understanding of the vehicle's immediate environment, facilitating improved decision-making and collision avoidance mechanisms.

In Medical Imaging

Similarly, in the field of medical imaging, the ability to perform instance segmentation allows for the granular analysis of medical scans. YOLOv8-seg can be employed to accurately distinguish between different types of tissues or to identify and measure pathological features with precision. This capability is invaluable for diagnosing conditions, planning treatments, and tracking disease progression with a higher degree of accuracy than previously possible.

By enhancing instance segmentation capabilities, YOLOv8-seg equips developers and researchers with a potent tool that significantly advances the performance of computer vision applications. Its innovations not only refine the process of object detection and classification but also pave the way for new developments across various industries that rely on sophisticated and reliable image analysis techniques.

Easily train YOLOv8 instance segmentation on a custom dataset

The Ikomia API lets you train and run inference with YOLOv8-seg with minimal coding.

Setup

The Ikomia API simplifies the development of Computer Vision workflows and allows for easy experimentation with different parameters to achieve the best results.

With the Ikomia API, we can train a custom YOLOv8 Instance Segmentation model with just a few lines of code. To get started, you need to install the API in a virtual environment [4].


pip install ikomia 

Dataset

In this tutorial, we will use the coral dataset from Roboflow [3].

Note: The dataset originally used is no longer accessible, so an alternative dataset has been provided. This explains the differences between the images in the article and the current dataset.

Custom dataset to train YOLOv8

A coral fine-tuned instance segmentation model could be used in marine biology and environmental conservation for the detailed analysis and monitoring of coral reefs. Instance segmentation goes beyond just detecting objects (in this case, corals) within an image; it also precisely delineates the boundaries of each coral instance. This fine-grained approach enables a range of applications:

  1. Coral Health Assessment: By segmenting individual corals, researchers can assess the health of each coral, identifying signs of bleaching, disease, or damage. This information is critical for monitoring the impacts of climate change, pollution, and other stressors on coral reefs.
  2. Biodiversity Analysis: Coral reefs are among the most diverse ecosystems on the planet. Instance segmentation models can help in quantifying biodiversity by identifying and counting the number of different coral species within an area. This aids in understanding the composition and resilience of coral communities.
  3. Habitat Mapping: Detailed maps of coral reef ecosystems can be created using instance segmentation, providing valuable information for conservation planning, marine park design, and impact assessment of human activities.
  4. Monitoring Coral Growth and Recovery: By applying instance segmentation models over time, scientists can track the growth rates of individual corals and the recovery of reefs following disturbances or conservation interventions.

Train YOLOv8 instance segmentation with a few lines of code

You can also run the open-source notebook we have prepared directly.


from ikomia.dataprocess.workflow import Workflow


# Initialize the workflow
wf = Workflow()

# Add the dataset loader to load your custom data and annotations
dataset = wf.add_task(name='dataset_coco')

# Set the parameters of the dataset loader
dataset.set_parameters({
    'json_file': 'Path/To/Coral_Segmentation.v7/Dataset/train/_annotations.coco.json',
    'image_folder': 'Path/To/Coral_Segmentation.v7/Dataset/train',
    'task': 'instance_segmentation',
}) 

# Add the YOLOv8 segmentation algorithm
train = wf.add_task(name='train_yolo_v8_seg', auto_connect=True)

# Set the parameters of the YOLOv8 segmentation algorithm
train.set_parameters({
    'model_name': 'yolov8m-seg',
    'batch_size': '4',
    'epochs': '50',
    'input_size': '640',
    'dataset_split_ratio': '0.8',
    'output_folder': 'Path/To/Folder/Where/Model-weights/Will/Be/Saved'
})

# Launch your training on your data
wf.run()

The training process for 50 epochs was completed in approximately 1 hour using an NVIDIA GeForce RTX 3060 Laptop GPU (6 GB of VRAM).

Step-by-step: Fine-tune a pre-trained YOLOv8-seg model using Ikomia API

With the coral dataset you downloaded, you can train a custom YOLOv8-seg model using the Ikomia API.

Step 1: import and create workflow


from ikomia.dataprocess.workflow import Workflow

wf = Workflow()

  • Workflow is the base object to create a workflow. It provides methods for setting inputs such as images, videos, and directories, configuring task parameters, obtaining time metrics, and accessing specific task outputs such as graphics, segmentation masks, and texts.

We initialize a workflow instance. The “wf” object can then be used to add tasks to the workflow instance, configure their parameters, and run them on input data.

Step 2: add the dataset loader

The downloaded COCO dataset consists of image files and .json annotation files. Images are split into train, val, and test folders, each with an associated .json file containing the image annotations (you can inspect them with the snippet after this list):

  • Image file name
  • Image size (width and height)
  • List of objects with the following information: object class (e.g., "person", "car"), bounding box coordinates (x, y, width, height), and segmentation mask (polygon)
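
If you want to peek at these annotations yourself, here is a quick sketch using Python's standard json module (the path is a placeholder for your own download):


import json

# Placeholder path: point this to your own downloaded annotation file
with open('Path/To/Dataset/train/_annotations.coco.json') as f:
    coco = json.load(f)

# Standard COCO keys: 'images', 'annotations', 'categories'
print(coco.keys())
print(coco['images'][0])        # file name, width, height, id
print(coco['annotations'][0])   # image_id, category_id, bbox, segmentation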

We will use the dataset_coco module provided by Ikomia API to load the custom data and annotations.


# Add the dataset loader to load your custom data and annotations
dataset = wf.add_task(name='dataset_coco')

# Set the parameters of the dataset loader
dataset.set_parameters({
    'json_file': 'Path/To/Mesophotic Coral/Dataset/train/_annotations.coco.json',
    'image_folder': 'Path/To/Mesophotic Coral/Dataset/train',
    'task': 'instance_segmentation'
}) 

Step 3: add the YOLOv8 segmentation model and set the parameters

We add the ‘train_yolo_v8_seg’ task to our workflow for training our custom YOLOv8-seg model. To customize our training, we specify the following parameters:


# Add the YOLOv8 segmentation algorithm
train = wf.add_task(name='train_yolo_v8_seg', auto_connect=True)

# Set the parameters of the YOLOv8 segmentation algorithm
train.set_parameters({
    'model_name': 'yolov8m-seg',
    'batch_size': '4',
    'epochs': '50',
    'input_size': '640',
    'dataset_split_ratio': '0.8',
    'output_folder':'Path/To/Folder/Where/Model-weights/Will/Be/Saved'
}) 

Here are the configurable parameters and their respective descriptions:

  • batch_size: Number of samples processed before the model is updated.
  • epochs: Number of complete passes through the training dataset.
  • input_size: Input image size during training and validation.
  • dataset_split_ratio: the algorithm automatically divides the dataset into train and evaluation sets. A value of 0.8 means 80% of the data is used for training and 20% for evaluation.

You also have the option to modify the following parameters:

  • workers: Number of worker threads for data loading. Currently set to '0'.
  • optimizer: The optimizer to use. Available choices include SGD, Adam, Adamax, AdamW, NAdam, RAdam, RMSProp, and auto.
  • weight_decay: The weight decay for the optimizer. Currently set to '5e-4'.
  • momentum: The SGD momentum/Adam beta1 value. Currently set to '0.937'.
  • lr0: Initial learning rate. For SGD, it is set to 1E-2, and for Adam, it is set to 1E-3.
  • lrf: Final learning rate, calculated as lr0 * lrf. Currently set to '0.01'.
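
For illustration, these optional parameters are set the same way as the required ones; the values below simply restate the defaults listed above:


# Optional: override the advanced training parameters
# (values here are the defaults mentioned above)
train.set_parameters({
    'workers': '0',
    'optimizer': 'auto',
    'weight_decay': '5e-4',
    'momentum': '0.937',
    'lr0': '0.01',
    'lrf': '0.01'
})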

Step 4: run your workflow

Finally, we run the workflow to start the training process.


wf.run()

You can monitor the progress of your training using tools like TensorBoard or MLflow.

Once the training is complete, the train_yolo_v8_seg task will save the best model in a folder named with a timestamp inside the output_folder. You can find your best.pt model in the weights folder of the time-stamped folder.
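
Because the run folder name is a timestamp, a small helper like the following sketch (assuming the folder layout described above) can locate the most recent best.pt automatically:


from pathlib import Path

# Same placeholder output folder as in the training snippet
output_folder = Path('Path/To/Folder/Where/Model-weights/Will/Be/Saved')

# Pick the most recently modified run folder (assumes it contains only
# timestamped runs), then search it recursively for the best checkpoint
latest_run = max(output_folder.iterdir(), key=lambda p: p.stat().st_mtime)
best_weights = next(latest_run.rglob('best.pt'))
print(best_weights)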

Test your fine-tuned YOLOv8-seg model

First, let's run a coral image through the pre-trained YOLOv8-seg model:


from ikomia.dataprocess.workflow import Workflow
from ikomia.utils.displayIO import display


# Initialize the workflow
wf = Workflow()

# Add the YOLOv8 segmentation algorithm
yolov8seg = wf.add_task(name='infer_yolo_v8_seg', auto_connect=True)

# Set the parameters of the YOLOv8 segmentation algorithm
yolov8seg.set_parameters({
    'model_name': 'yolov8m-seg',
    'conf_thres': '0.2',
    'iou_thres': '0.7'
}) 

# Run on your image
wf.run_on(path="Path/To/Mesophotic Coral Identification.v1i.coco-segmentation/test/Image_4_jpg.rf.7873f786060e89a0f071ff18d377db0f.jpg")

# Inspect your results
display(yolov8seg.get_image_with_mask_and_graphics())

We can observe that the default pre-trained infer_yolo_v8_seg model mistakes a coral for a bear. This is because the model has been trained on the COCO dataset, which does not contain any coral objects.

To test the model we just trained, we specify the path to our custom model using the ’model_weight_file’ argument. We then run the workflow on the same image we used previously.


# Set the path of your custom YOLOv8-seg model
yolov8seg.set_parameters({
    'model_weight_file': 'Path/To/Output_folder/[timestamp]/train/weights/best.pt',
    'conf_thres': '0.5',
    'iou_thres': '0.7'
})

# Run on the same image as before and display the results
wf.run_on(path="Path/To/Mesophotic Coral Identification.v1i.coco-segmentation/test/Image_4_jpg.rf.7873f786060e89a0f071ff18d377db0f.jpg")
display(yolov8seg.get_image_with_mask_and_graphics())


Fine-tuned YOLOv8-seg inference output

Comparing our results to the ground truth, we successfully identified the species Orbicella spp. Nevertheless, we did observe some instances of false negatives. To enhance the performance of our custom model, further training for additional epochs and augmenting our dataset with more images could be beneficial.

Another example showcasing effective detection results is demonstrated with the Agaricia agaricites species:

Inference output of the trained YOLOv8 model

Start training easily with Ikomia

In this article, we looked into the mechanics and advantages of YOLOv8, showcasing its simplicity, versatility, and scalability. We have seen how you can fine-tune any of the instance segmentation algorithms implemented in Ikomia with just a few lines of code.

To learn more about the API, you can refer to the documentation. Additionally, you can explore the list of state-of-the-art algorithms on Ikomia HUB and try out Ikomia STUDIO, which provides a user-friendly interface with the same features as the API.

FAQ: Training YOLOv8 for Instance Segmentation

What is YOLOv8 Instance Segmentation?

YOLOv8 instance segmentation is a Computer Vision technique that identifies and delineates individual objects within an image. Unlike semantic segmentation, which classifies each pixel into predefined categories, instance segmentation distinguishes between different instances of objects, allowing for precise identification and separation. Training YOLOv8 for instance segmentation enhances this capability, making it suitable for detailed object recognition.

What is YOLOv8 and how does it differ from other object detectors?

YOLOv8, developed by Ultralytics, specializes in object detection, image classification, and instance segmentation tasks. It is known for its accuracy and compact model size. Unlike previous versions, YOLOv8 introduces anchor-free detection and new convolution configurations, improving its performance and simplifying post-processing steps like Non-Maximum Suppression. Training YOLOv8 involves fine-tuning these features to enhance its capabilities.

How can I train YOLOv8 instance segmentation on a custom dataset?

You can easily train YOLOv8 instance segmentation using the Ikomia API. Here are the basic steps:

  1. Setup: Install the Ikomia API in a virtual environment.
  2. Dataset Preparation: Use a dataset in YOLO darknet, COCO or Pascal VOC format.
  3. Workflow Creation: Initialize a workflow instance and configure the parameters for training YOLOv8.
  4. Dataset Loading: Load the custom data and annotations using a dataset loader.
  5. Model Training: Add the ‘train_yolo_v8_seg’ task to your workflow and specify training parameters such as batch size, epochs, and input size.
  6. Run Training: Execute the workflow and monitor the progress using tools like TensorBoard or MLflow.

What are some common training parameters for YOLOv8 instance segmentation?

  • Batch Size: Number of samples processed before the model is updated.
  • Epochs: Number of complete passes through the training dataset.
  • Input Size: Input image size during training and validation.
  • Dataset Split Ratio: Ratio for dividing the dataset into training and evaluation sets.
  • Optimizer: Choice of optimizer (e.g., SGD, Adam).
  • Learning Rate (lr0 and lrf): Initial and final learning rates.

How can I test my trained YOLOv8-seg model?

After training YOLOv8, you can test your model by running it on a sample image and comparing the results to the ground truth. Specify the path to your custom model and use the ‘infer_yolo_v8_seg’ task to perform instance segmentation on new images.

How long does it take to train a YOLOv8 instance segmentation model?

The time to train YOLOv8 can vary based on hardware, dataset size, number of epochs, and pre-trained model size. For example, training YOLOv8m-seg for 50 epochs on an NVIDIA GeForce RTX 3060 Laptop GPU (6 GB) took approximately 1 hour.

References

[1] https://github.com/ultralytics/ultralytics

[2] https://github.com/ultralytics/ultralytics/issues/189

[3] Coral Dataset

[4] How to install a virtual environment
