This blog post will provide a comprehensive overview of Detectron2, highlighting its key features and advantages. We'll guide you through the installation and the practical usage of Detectron2. We'll address common challenges such as installation issues, compatibility concerns, and algorithmic intricacies.
The deep learning landscape is enriched with numerous tools and libraries designed to simplify complex tasks. In the domain of computer vision, object detection is one of the tasks that has attracted the most attention. With the release of Detectron2, Facebook AI Research (FAIR) took the challenge head-on, offering a cutting-edge platform for this purpose.
Detectron2 is an open-source project from Facebook AI Research (FAIR) and represents the second version of the Detectron library. Unlike its predecessor, Detectron2 is written in PyTorch, one of the most popular deep learning libraries. This transition provides developers and researchers with greater flexibility, extensibility, and ease of use.
1. Modular and flexible design: Detectron2 is built with modularity in mind. This allows researchers and developers to easily plug in new components or tweak existing ones without much hassle.
2. Extensive model zoo: It comes with a plethora of pre-trained models. Whether you are looking to implement instance segmentation, panoptic segmentation, or plain object detection, Detectron2 has a pre-trained model available.
3. Native PyTorch implementation: Unlike its predecessor, which was built on Caffe2, Detectron2 leverages the capabilities of PyTorch, making it much easier to use and integrate with other PyTorch-based tools.
4. Training and evaluation utilities: Detectron2 provides out-of-the-box functionalities that streamline the process of training, evaluating, and fine-tuning models.
Detectron2 provides a wide range of models in its model zoo, each tailored for specific computer vision tasks. Here's a breakdown of the main models Detectron2 offers for different tasks:
DeepLabv3+: An encoder-decoder model known for strong performance on semantic segmentation tasks, leveraging atrous convolutions and atrous spatial pyramid pooling (ASPP).
For this section, we will navigate through the Detectron2 documentation for instance segmentation. Before jumping in, we recommend reviewing the entire process, as we ran into several problematic steps; browsing through our various attempts first will save you time and energy.
Detectron2 suggests specific OS, Python, and PyTorch versions for optimal results:
That said, we are venturing forward on a Windows machine, and for all the Windows users reading this, let's make it happen!
Setting up a working environment begins with creating a Python virtual environment and then installing the Torch dependencies.
Python 3.7 or higher is suggested; we are opting for Python 3.10:
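Creating the environment is a one-liner; the name 'venvdetectron2' matches the environment referenced below (a minimal sketch, assuming a `python` launcher on your PATH):

```shell
# Create a virtual environment named 'venvdetectron2'
python -m venv venvdetectron2
# Activate it:
#   Windows:     venvdetectron2\Scripts\activate
#   Linux/macOS: source venvdetectron2/bin/activate
```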
If you're new to virtual environments, here's a comprehensive guide to help you set up one.
After activating the 'venvdetectron2' virtual environment, we proceed to install the PyTorch and OpenCV dependencies:
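A minimal CPU-only install looks like this (a sketch; for GPU builds, use the wheel index URL from pytorch.org that matches your CUDA version):

```shell
# Install PyTorch, torchvision, and OpenCV into the active environment.
# For a CUDA build, append the appropriate --index-url from pytorch.org.
pip install torch torchvision opencv-python
```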
Now it’s time to build Detectron2 from source.
Following the official recommendation, we initially tried installing Detectron2 directly from its Git repository:
Unfortunately, we were met with an error message.
We proceeded to try a different approach by cloning the repository locally and then installing:
This time, the compilation error stack was so extensive that the command prompt interface wouldn't even display the beginning of it.
It was disappointing to discover that the Detectron2 documentation did not provide any further installation alternatives except Docker containers, which are ephemeral and complex to set up. Additionally, the 'Common Installation Issues' section didn't address the specific error we encountered.
When facing issues with a particular repository, the 'issues' section is typically a reliable resource for potential solutions.
At the time of writing this post, there were several open issues related to support for Windows users. Unfortunately, the lack of response from the facebookresearch Detectron2 team suggests that Windows support may not be forthcoming.
Considering the 2023 Stack Overflow survey indicates Windows remains the dominant operating system for developers (both in personal and professional spheres), the absence of Windows support is indeed perplexing.
Given the lack of Windows support, we forked and edited the repository so that it compiles correctly across operating systems.
We selected the 'mask_rcnn_R_50_FPN_3x' model and its corresponding config file from the model zoo. To demonstrate the built-in configurations, we utilized the 'demo.py' provided. Note that 'demo.py' can be found in the 'detectron2/demo' directory.
Navigating the Detectron2 setup proved to be a time-consuming challenge, taking over an hour and significant effort to implement successfully.
Although the demo was executed seamlessly, identifying the right combination of model weights and configuration files for more extensive testing is less than intuitive.
In the following section, we'll demonstrate how to simplify the installation and usage of Detectron2 via the Ikomia API, significantly reducing both the steps and time needed to execute object detection tasks.
With the Ikomia team, we've been working on a prototyping tool to avoid dependency and compatibility issues, thereby speeding up the often tedious processes of installation and testing. We wrapped it in an open-source Python API. Now we're going to explain how to use all the Detectron2 models in less than 5 minutes.
If you have any questions, please join our Discord.
As usual, we will use a virtual environment.
Then the only thing you need to install is Ikomia API:
You can also directly load the notebook we have prepared.
To carry out instance segmentation, we simply installed Ikomia and ran the workflow code snippets. All dependencies were seamlessly handled in the background. We progressed from setting up a virtual environment to obtaining results in approximately 5 minutes.
We've implemented all the Detectron2 algorithms for both inference and training. You can conveniently find code snippets tailored to your needs on the Ikomia HUB.
Real-world object detection applications frequently necessitate fine-tuning your model and integrating it with other models, such as object tracking.
One of the standout benefits of the API, aside from simplifying dependency installation, is its ability to seamlessly interlink algorithms from diverse frameworks, such as Detectron2, OpenMMLab, YOLO, and Hugging Face.
Once you have crafted your solution with Ikomia's Python API, you can deploy it yourself, or opt for SCALE, our automated deployment SaaS platform.