In this blog post, we will outline the essential steps for achieving real-time video object detection using the Ikomia API alongside your webcam.
The Ikomia API enables you to utilize a ready-to-use detection model for real-time video object detection in a video stream captured from your camera. To begin, you'll need to install the API within a virtual environment.
If you need a refresher on this step, see the guide: How to install a virtual environment.
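Setting this up typically looks like the following (the environment name is arbitrary):

```bash
# Create and activate a virtual environment, then install the Ikomia API
python -m venv ikomia-env
source ikomia-env/bin/activate   # on Windows: ikomia-env\Scripts\activate
pip install ikomia
```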
Alternatively, you can directly access the open-source notebook that we have prepared.
Camera stream processing involves the real-time analysis and manipulation of images and video streams captured from a camera. This technique finds widespread application in diverse fields such as Computer Vision, surveillance, robotics, and entertainment.
In Computer Vision, camera stream processing plays a pivotal role in tasks like object detection and recognition, face detection, motion tracking, and image segmentation.
Across these domains, camera stream processing enables exciting applications that were once considered out of reach.
To get started with camera stream processing, we will use OpenCV's VideoCapture together with the YOLOv7 algorithm.
Here are the detailed steps, with all parameters explained.
Initialize a video capture object to retrieve frames from a camera device. Use the following code:
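A minimal sketch of this step; we name the capture object `stream`, which matches the cleanup step at the end of this post:

```python
import cv2

# Open the default camera (device index 0)
stream = cv2.VideoCapture(0)
```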
The parameter `0` passed to VideoCapture indicates that you want to capture video from the default camera device connected to your system. If you have multiple cameras connected, you can specify a different index to capture video from a specific camera (e.g., `1` for the second camera), or you can pass the path to a video file instead.
We initialize a workflow instance using the following code:
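Based on the Ikomia API, the initialization is:

```python
from ikomia.dataprocess.workflow import Workflow

# Create an empty workflow to which tasks will be added
wf = Workflow()
```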
The `wf` object can then be used to add tasks to the workflow instance, configure their parameters, and run them on input data.
By default, OpenCV uses the BGR color format, whereas Ikomia works with RGB images. To display the image output with the right colors, we need to flip the blue and red planes.
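With OpenCV, that conversion is a single call applied to each frame before it goes into the workflow:

```python
# Swap the blue and red channels: BGR (OpenCV) -> RGB (Ikomia)
frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
```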
Add the `infer_yolo_v7` task, setting the pre-trained model and the confidence threshold parameter using the following code:
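A sketch of this step; the parameter names below (`model_name`, `conf_thres`) are the ones exposed by the `infer_yolo_v7` algorithm on Ikomia HUB, so check them against the version you install:

```python
# Add the YOLOv7 detection task and auto-connect it to the workflow input
detector = wf.add_task(name="infer_yolo_v7", auto_connect=True)

# Choose the pre-trained model and the confidence threshold
detector.set_parameters({
    "model_name": "yolov7",   # pre-trained COCO weights (assumed default name)
    "conf_thres": "0.5",      # keep detections scoring above 50%
})
```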
We read frames from the video stream in a continuous loop; if a frame cannot be read, we skip to the next iteration.
For each frame, we run the workflow and display the results with OpenCV. The displayed image includes the graphics generated by the YOLOv7 object detector.
The display window lets the user quit the streaming process by pressing the 'q' key: when 'q' is pressed, the loop is broken and streaming ends.
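Putting these pieces together, the loop could look like the sketch below. We assume the task object exposes `get_image_with_graphics()` to merge the detection overlay onto the frame, as in Ikomia's examples; the result is converted back to BGR so OpenCV displays the colors correctly:

```python
while True:
    # Grab the next frame; skip the iteration if reading fails
    ret, frame = stream.read()
    if not ret:
        continue

    # OpenCV delivers BGR, Ikomia expects RGB
    frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

    # Run the workflow on the current frame
    wf.run_on(array=frame)

    # Get the frame with the YOLOv7 detection graphics drawn on it
    img_out = detector.get_image_with_graphics()

    # Convert back to BGR for correct colors in the OpenCV window
    cv2.imshow("YOLOv7 detection - press 'q' to quit",
               cv2.cvtColor(img_out, cv2.COLOR_RGB2BGR))

    # Press 'q' to end the streaming process
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
```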
After the loop, release the stream object and destroy all windows created by OpenCV.
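In code:

```python
# Release the capture device and close all OpenCV windows
stream.release()
cv2.destroyAllWindows()
```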
By leveraging the Ikomia API, developers can streamline the creation of Computer Vision workflows and experiment with different parameters to achieve the best possible outcomes.
For additional insights into the API, we recommend referring to the comprehensive documentation. Additionally, you can explore the selection of cutting-edge algorithms available on Ikomia HUB and experiment with Ikomia STUDIO, a user-friendly interface that encompasses the same functionality as the API. Take advantage of these resources to further enhance your Computer Vision endeavors.