With the surge in machine learning applications, one big question keeps coming up: are they actually adding value? The answer lies in their real-world impact, and that requires successful deployment and smooth production operation.
Here's a reality check: many ML projects don't deliver as promised. Data scientists, the folks behind these models, reveal the hard truth: a survey by Rexer Analytics shows that only 22% of "revolutionary" models get deployed. Shocked? There's more: across all ML projects, just 32% ever see the light of day [1].
So why are roughly 70% of the ML models that get developed never deployed to production? The main reasons include:
1. Complexity of Deployment: Deploying Computer Vision (CV) models means ensuring compatibility across varied hardware and software environments and addressing real-time performance constraints. When moving from a local setup to a cloud platform, developers run into dependency problems: every library, framework, and tool has to behave consistently. On top of that, optimizing models for specific hardware, like GPUs, adds further complexity.
2. Resource Constraints: Often, as computer vision developers, we don’t have a DevOps team to help out. Many developers lack the necessary infrastructure or expertise to effectively deploy and maintain CV models, particularly when working with edge devices or in cloud environments.
These factors contribute to the significant drop-off between the number of ML models developed and those successfully deployed.
We understand the challenges and hurdles of deploying computer vision models, particularly the complexity and resource constraints. To address these issues, we've introduced Ikomia SCALE, a streamlined platform designed to simplify the deployment process for Computer Vision models.
Ikomia SCALE eliminates the need for extensive coding and adaptation across different devices. Whether deploying on a GPU instance on AWS or another provider, SCALE ensures a smoother transition. Here’s how:
1. Workflow Creation: Create a workflow using SDXL with just a few lines of code using the Ikomia API.
2. Deployment: Deploy your model seamlessly on an AWS GPU instance (e.g., T4 or A10) on SCALE. The deployment steps are consistent across various platforms such as Google Cloud Platform or Scaleway, regardless of whether you're using CPU or GPU infrastructure.
3. Accessing Your REST API Endpoint: Easily access your REST API endpoint, which allows for efficient communication between your applications and the deployed model using standard HTTP methods.
For a comprehensive guide on leveraging the Ikomia ecosystem, visit the official Ikomia SCALE page.
To illustrate the power of the Ikomia ecosystem, let's consider a scenario where we need to deploy SDXL, an open-source diffusion model, with a refiner on an NVIDIA A10 AWS instance (24 GB VRAM). Here's how I streamlined the process:
The Ikomia CLI is an essential tool for managing your projects, algorithms, and deployments. It significantly simplifies the workflow and is the most convenient way to push your projects to the SCALE platform.
Let's run the SDXL algorithm and save it as a JSON file that outlines the structure of the workflow, including the output types and the parameters used.
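Below is a minimal sketch of such a workflow, assuming the Ikomia API is installed (`pip install ikomia`). The algorithm name (`infer_hf_stable_diffusion`), the model name, and the parameter keys are assumptions taken from Ikomia HUB and may differ in your version, so check the algorithm's HUB page before running:

```python
from ikomia.dataprocess.workflow import Workflow

# Create an empty workflow and add the Stable Diffusion algorithm from Ikomia HUB.
# auto_connect=True wires the task's inputs to the previous task's outputs.
wf = Workflow("SDXL workflow")
sdxl = wf.add_task(name="infer_hf_stable_diffusion", auto_connect=True)

# Configure the task; these parameter keys are illustrative assumptions --
# verify them on the algorithm's HUB page.
sdxl.set_parameters({
    "model_name": "stabilityai/stable-diffusion-xl-base-1.0",
    "prompt": "a Miniature Schnauzer in a galaxy environment, 8k, vivid",
})

# Save the workflow structure (tasks, parameters, output types) as JSON.
wf.save("sdxl_workflow.json")
```

The saved JSON file is what you will push to SCALE in the next steps.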
Before deploying, you need to push your workflow to Ikomia SCALE. Here’s how to get started:
Begin by either creating an Ikomia account or logging in. You can sign up using your email, Google, or GitHub account.
In Ikomia SCALE, a project acts as a central hub for your workflows, helping you organize them effectively. To create a new project, follow these steps:
1. Go to the Dashboard
2. Click on "New Project" button to start creating your project.
3. Fill out the card: Complete the form with the following information:
Option 1: Token Generation Using Ikomia CLI
To authenticate with Ikomia SCALE, you need to generate an access token. Use the following command, replacing <your_login> and <your_password> with your Ikomia SCALE credentials:
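A sketch of that command, assuming the Ikomia CLI is installed (`pip install ikomia-cli`); the exact flags may differ between CLI versions, so check `ikcli --help`:

```shell
# Generate an access token for Ikomia SCALE (the token is printed to stdout).
ikcli login --username <your_login> --password <your_password>
```

Copy the printed token; you will store it as an environment variable in the next step.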
Option 2: Token Generation from the SCALE Platform
1. Access Token Management: Navigate to the top right corner of the SCALE platform, then go to the settings or account section where you will find API token management.
2. Create New Token: Click on the option to create a new token. Fill in the required details such as the token's name and expiration date.
3. Generate and Save the Token.
4. Copy the Token: Once generated, copy the token. You are now ready to set the token as an environment variable.
Then store the token as an environment variable so it is kept out of your scripts and readily available for your Ikomia SCALE sessions.
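For example (the variable name `IKOMIA_TOKEN` is what the CLI is expected to read; confirm it in the CLI documentation):

```shell
# Replace the placeholder with the token generated in the previous step.
export IKOMIA_TOKEN="<your_token>"
```
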
Once you've set your access token, you’re ready to upload your workflow to the Ikomia SCALE platform. Use the following command to push your workflow:
Make sure to replace <project_name> with the actual name of your project on Ikomia SCALE ('stable_diffusion' in this example). This command uploads your workflow JSON file directly to the designated project in SCALE, making it available for management and deployment.
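Assuming the workflow was saved as `sdxl_workflow.json` (the filename is illustrative) and the CLI is authenticated, the push looks like this; verify the exact subcommand with `ikcli project --help`:

```shell
# Upload the workflow JSON to the given project on Ikomia SCALE.
ikcli project push <project_name> sdxl_workflow.json
```
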
With your workflow now integrated into your project on Ikomia SCALE, the next step is to deploy it and create a live endpoint for practical use.
1. Access Your Project: Go to your project page on Ikomia SCALE. Here, you’ll find an overview of all the workflows included in your project.
2. Deploy Your Workflow:
1. Create a Deployment: On the right-hand side of the project interface, locate the deployment settings where you can configure and initiate the deployment of your selected workflow.
2. Select the Provider: Choose your preferred cloud provider, such as AWS, Google Cloud, or Scaleway, based on their capabilities and your specific requirements.
3. Choose Deployment Type: Ikomia SCALE provides three primary types of compute infrastructure to cater to different computational and budgetary needs; select the type that suits yours.
4. Pick a Region: Select a geographical region that aligns with your latency needs and data residency requirements. This optimizes performance and ensures compliance with local data laws.
5. Determine the Size: Based on your computational and memory needs, choose the appropriate size for your deployment, ranging from XS (NVIDIA T4) to XL (NVIDIA A10).
6. Launch Your Deployment: Click the 'Add deployment' button to initiate the deployment process.
By following these steps, you can deploy your workflow on Ikomia SCALE, making it available for real-time application and management.
Once initiated, the new deployment will appear on the left-hand side of the page under the deployments section. The time it takes for your deployment to become operational can vary depending on the complexity of the workflow and the specifics of your chosen configuration.
For this deployment, I selected the NVIDIA A10 24GB GPU configuration, opting for size M. The setup process took about 20 minutes to complete. Once the setup was finished, the system was fully operational, and the deployment ran smoothly.
SCALE offers a convenient Test Interface to verify that your deployed workflow is functioning as expected.
1. Accessing the Test Interface: To access the Test Interface, navigate to the page of your deployed workflow and click on the ‘Test me’ button associated with your deployment. You can also access the Test Interface by directly visiting the endpoint’s URL.
2. Executing Your Workflow: Here’s how to test your workflow:
Once you've run the workflow, the Test Interface will display the results, allowing you to evaluate the performance and output of your workflow directly.
Prompt: ‘galaxy environment, Capturing A whimsical, a Miniature Schnauzer, winter spring wind rainbow a sprinkle of edible glitter in an dream magical background, trippy, 8k, vivid, ultra details, colorfull lighting, surreal photography, portrait’
The Test Interface displays the generated image. This setup provides a straightforward way to verify that your workflow is functioning correctly and is ready for integration with your systems.
Once you have successfully deployed your workflow on Ikomia SCALE, you will receive a REST API endpoint for each deployment. This endpoint is essential for integrating your computer vision model into your existing applications or systems, enabling them to send images to the model and retrieve the processed results seamlessly.
To make the most of these REST API endpoints, follow the steps in the dedicated integration guide. It will help you streamline the integration process and enhance the performance of your computer vision applications.
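To illustrate what such an integration can look like, here is a minimal, hypothetical Python sketch using only the standard library. The endpoint URL, the request schema, and the authentication header are assumptions for illustration; the real ones are shown on your deployment page and in the endpoint documentation:

```python
import json
from urllib import request

# Hypothetical endpoint URL -- copy the real one from your deployment page.
ENDPOINT_URL = "https://<your-deployment-id>.scale.ikomia.ai/api/run"


def build_payload(prompt: str) -> dict:
    """Build a JSON body for a workflow run.

    This schema is an assumption for illustration; the actual schema
    depends on your workflow's inputs and parameters.
    """
    return {"inputs": [], "parameters": {"prompt": prompt}}


def run_workflow(url: str, prompt: str, token: str) -> bytes:
    """POST the payload to the deployed endpoint and return the raw response."""
    req = request.Request(
        url,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )
    with request.urlopen(req) as resp:
        return resp.read()


# Example payload (no network call is made here).
payload = build_payload("a Miniature Schnauzer in a galaxy environment")
print(json.dumps(payload))
```

In a real application you would call `run_workflow(ENDPOINT_URL, prompt, token)` with the token generated earlier and then decode the response according to your workflow's output types.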
[1] https://drive.google.com/file/d/1Mz3WmtcvUl-00gaT2XKCxdE5-pqbOOjz/view
[2] https://www.ikomia.ai/blog/a-step-by-step-guide-to-creating-virtual-environments-in-python