Efficient Deployment of Computer Vision Models on AWS with Ikomia SCALE

Allan Kouidri
-
5/28/2024
Illustration: a robot overcoming a deployment hurdle

With the surge in machine learning applications, one big question pops up: Are they actually adding value? The answer lies in their real-life impact, which requires successful deployments and smooth production operations to show their true worth.

Here's a reality check: many ML projects don't deliver as promised. Data scientists, the people behind these models, reveal the hard truth. A recent survey by Rexer Analytics shows that only 22% of "revolutionary" models get deployed. Shocked? There's more: across all ML projects, just 32% see the light of day [1].

So why do about 70% of the ML models developed never reach production? The reasons include:

    1. Complexity of Deployment: Deploying Computer Vision (CV) models involves ensuring compatibility across various hardware and software environments and addressing real-time performance issues. Developers face dependency problems when transitioning from local to cloud platforms, ensuring all libraries, frameworks, and tools work consistently. Additionally, optimizing models for specific hardware, like GPUs, adds to the complexity​.

    2. Resource Constraints: Often, as computer vision developers, we don’t have a DevOps team to help out. Many developers lack the necessary infrastructure or expertise to effectively deploy and maintain CV models, particularly when working with edge devices or in cloud environments.

These factors contribute to the significant drop-off between the number of ML models developed and those successfully deployed​.

We understand the challenges and hurdles of deploying computer vision models, particularly the complexity and resource constraints. To address these issues, we've introduced Ikomia SCALE, a streamlined platform designed to simplify the deployment process for Computer Vision models.

Why Ikomia SCALE?

Ikomia SCALE eliminates the need for extensive coding and adaptation across different devices. Whether deploying on a GPU instance on AWS or another provider, SCALE ensures a smoother transition. Here’s how:

    1. Workflow Creation: Create an SDXL workflow with just a few lines of code using the Ikomia API.

    2. Deployment: Deploy your model seamlessly on an AWS GPU instance (e.g., T4 or A10) on SCALE. The deployment steps are consistent across platforms such as Google Cloud Platform or Scaleway, whether you're using CPU or GPU infrastructure.

    3. Accessing Your REST API Endpoint: Easily access your REST API endpoint, which allows efficient communication between your applications and the deployed model using standard HTTP methods.

For a comprehensive guide to the Ikomia ecosystem, visit the official Ikomia SCALE page.

Step 1: Creating your workflow - SDXL + refiner

To illustrate the power of the Ikomia ecosystem, let's consider a scenario where we need to deploy an open-source diffusion model using SDXL with a refiner on an NVIDIA A10 AWS instance (24GB VRAM). Here’s how I streamlined this process:

1. Installing the Ikomia CLI:

  • Setup virtual environment: First, create a virtual environment to keep your dependencies organized [2].
  • Install Ikomia CLI: Install the Ikomia Command Line Interface (CLI) in your virtual environment.

pip install "ikomia-cli[full]"

The CLI is an essential tool for efficiently managing your projects, algorithms, and deployments. It significantly simplifies the workflow and provides the most convenient way to push your projects to the SCALE platform.

2. Create, test and export your workflow

Let's build a workflow that runs the SDXL algorithm, then save it as a JSON file that outlines the structure of the workflow, including the output type and the parameters used.


from ikomia.dataprocess.workflow import Workflow
from ikomia.utils.displayIO import display

# Init your workflow
wf = Workflow()

# Add the Stable Diffusion algorithm
algo = wf.add_task(name="infer_hf_stable_diffusion", auto_connect=False)

algo.set_parameters({
    'model_name': 'stabilityai/stable-diffusion-xl-base-1.0',
    'prompt': 'galaxy environment, Capturing A whimsical, a Miniature Schnauzer, winter spring wind rainbow a sprinkle of edible glitter in an dream magical background, trippy, 8k, vivid, ultra details, colorfull lighting, surreal photography, portrait',
    'guidance_scale': '7.5',
    'negative_prompt': 'low resolution',
    'num_inference_steps': '50',
    'width': '768',
    'height': '1024',
    'seed': '1981651',
    'use_refiner': 'False'
})

# Run
wf.run()

# Display the image
display(algo.get_output(0).get_image())

# Save the workflow as a JSON file in your current folder
wf.save("./sdxl_workflow.json")

SDXL generated image: dog with glitter

Step 2: Push your workflow

Before deploying, you need to push your workflow to Ikomia SCALE. Here’s how to get started:

Pre-requisites

Begin by either creating an Ikomia account or logging in. You can sign up using your email, Google, or GitHub account.

1. Create a Project

In Ikomia SCALE, a project acts as a central hub for your workflows, helping you organize them effectively. To create a new project, follow these steps:

1. Go to the Dashboard

2. Click on the "New Project" button to start creating your project.

3. Fill out the card: Complete the form with the following information:

  • Workspace: Select the workspace where you want to store your project. Choose your personal workspace if the project is for personal use.
  • Project Name: Provide a descriptive name for your project that reflects its purpose.
Ikomia SCALE create project SDXL

Generate your access tokens

Option 1: Token Generation Using Ikomia CLI

To authenticate with Ikomia SCALE, you need to generate an access token. Use the following command, replacing <your_login> and <your_password> with your Ikomia SCALE credentials:


ikcli login --token-ttl "<token_duration_in_seconds>" --username "<your_login>" --password "<your_password>"

Option 2: Token Generation from the SCALE Platform

1. Access Token Management: Navigate to the top right corner of the SCALE platform, then go to the settings or account section where you will find API token management.

2. Create New Token: Click on the option to create a new token. Fill in the required details such as the token's name and expiration date.

3. Generate and Save the Token.

4. Copy the Token: Once generated, copy the token. You are now ready to set the token as an environment variable.

Next, store the token as an environment variable so that your Ikomia SCALE sessions can access it securely.

  • In a notebook (IPython/Jupyter)

%env IKOMIA_TOKEN=<your_token>

  • On Windows (Command Prompt)

set IKOMIA_TOKEN=<your_token>

  • On Linux or macOS

export IKOMIA_TOKEN=<your_token>
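
Once the variable is set, your scripts can read it at run time. Here is a minimal sketch using only the Python standard library (the error message is illustrative):

```python
import os

def get_ikomia_token() -> str:
    """Read the SCALE access token from the environment, failing loudly if it is missing."""
    token = os.environ.get("IKOMIA_TOKEN")
    if not token:
        raise RuntimeError("IKOMIA_TOKEN is not set; define it as shown above.")
    return token
```

Reading the token from the environment keeps it out of your source code and version control.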

Upload Your Workflow to Ikomia SCALE

Once you've set your access token, you’re ready to upload your workflow to the Ikomia SCALE platform. Use the following command to push your workflow:


ikcli project push <project_name> sdxl_workflow.json

Make sure to replace <project_name> with the actual name of your project on Ikomia SCALE, like 'stable_diffusion' in this example. This command will upload your workflow JSON file directly to the designated project in SCALE, making it available for management and deployment as required.

Step 3: Deploy your workflow

With your workflow now integrated into your project on Ikomia SCALE, the next step is to deploy it and create a live endpoint for practical use.

    1. Access Your Project: Go to your project page on Ikomia SCALE. Here, you’ll find an overview of all the workflows included in your project.

    2. Deploy Your Workflow:

  • Select Workflow: From the list, choose the workflow you wish to deploy.
  • Choose Deployment Environment: Ikomia SCALE offers flexibility in selecting from various cloud providers and regions, enabling you to deploy your workflow in the most suitable environment.
Ikomia SCALE stable diffusion project

Ikomia SCALE provides three primary types of compute infrastructures to cater to different computational and budgetary needs:

  • Serverless: Ideal for workloads with irregular usage patterns, this CPU-only environment is cost-effective because you are billed only for the execution time of your workflow.
  • CPU Instances (Coming Soon): These are dedicated CPU-only instances billed based on actual usage time, calculated to the second. This upcoming option is perfect for those needing consistent CPU resources.
  • GPU Instances (Coming Soon): For more intensive computational tasks, such as our current use case, dedicated instances with GPU acceleration are ideal and are also billed per second.

By following these steps, you can deploy your workflow on Ikomia SCALE, making it available for real-time application and management.

Ikomia SCALE AWS GPU provider selection

1. Create a Deployment: On the right-hand side of the project interface, locate the deployment settings where you can configure and initiate the deployment of your selected workflow.

2. Select the Provider: Choose your preferred cloud provider, such as AWS, Google Cloud, or Scaleway, based on their capabilities and your specific requirements.

3. Choose Deployment Type: Select the deployment type that suits your needs:

  • Serverless: Ideal for flexible billing in CPU-only environments.
  • CPU Instance: For dedicated CPU power (coming soon).
  • GPU Instance: For intensive compute tasks (coming soon).

4. Pick a Region: Select a geographical region that aligns with your latency needs and data residency requirements. This optimizes performance and ensures compliance with local data laws.

5. Determine the Size: Based on your computational and memory needs, choose the appropriate size for your deployment, ranging from XS (Nvidia T4) to XL (Nvidia A10).

6. Launch Your Deployment: Click the 'Add deployment' button to initiate the deployment process of your workflow.

Once initiated, the new deployment will appear on the left-hand side of the page under the deployments section. The time it takes for your deployment to become operational can vary depending on the complexity of the workflow and the specifics of your chosen configuration.
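
While you wait, you can also poll the deployment from a script instead of refreshing the page. Below is a generic retry sketch in Python; the readiness check here is a stand-in, and a real check would issue an HTTP request against your deployment's endpoint URL:

```python
import time

def wait_until_ready(check, timeout_s=1800, interval_s=30):
    """Call `check()` periodically until it returns True or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if check():
            return True
        time.sleep(interval_s)
    return False

# Example with a stand-in check; a real one would request the endpoint URL.
attempts = {"n": 0}

def fake_check():
    attempts["n"] += 1
    return attempts["n"] >= 3  # reports "ready" on the third poll

ready = wait_until_ready(fake_check, timeout_s=10, interval_s=0)
```

The timeout and interval defaults above are arbitrary; tune them to the roughly 20-minute setup time observed below.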

SDXL deployed and running - Ikomia SCALE

For this deployment, I selected the NVIDIA A10 24GB GPU configuration, opting for size M. The setup process took about 20 minutes to complete. Once the setup was finished, the system was fully operational, and the deployment ran smoothly.

Step 4: Test your deployment 

SCALE offers a convenient Test Interface to verify that your deployed workflow is functioning as expected.

1. Accessing the Test Interface: To access the Test Interface, navigate to the page of your deployed workflow and click on the ‘Test me’ button associated with your deployment. You can also access the Test Interface by directly visiting the endpoint’s URL.

2. Executing Your Workflow: Here’s how to test your workflow:

  1. Upload an Image: You have the option to upload an image file directly from your computer. This allows you to see how your workflow processes your specific images.
  2. Choose a Sample Image: SCALE provides a variety of sample images for testing. These are useful for quickly checking how your workflow handles different types of content without having to upload your own.

Once you've run the workflow, the Test Interface will display the results, allowing you to evaluate the performance and output of your workflow directly.

Ikomia SCALE test interface

Prompt: ‘galaxy environment, Capturing A whimsical, a Miniature Schnauzer, winter spring wind rainbow a sprinkle of edible glitter in an dream magical background, trippy, 8k, vivid, ultra details, colorfull lighting, surreal photography, portrait’

The Test Interface displays the generated image. This setup provides a straightforward way to verify that your workflow is functioning correctly, ensuring it is ready for integration with your systems.

Step 5: Integrating a Deployment

Once you have successfully deployed your workflow on Ikomia SCALE, you will receive a REST API endpoint for each deployment. This endpoint is essential for integrating your computer vision model into your existing applications or systems, enabling them to send images to the model and retrieve the processed results seamlessly.

To make the most of these REST API endpoints, follow these steps:

  1. Integrate the Endpoint: Incorporate this endpoint into your application, allowing it to communicate with the deployed model. Your application can send images to this endpoint and receive processed results in return.
  2. Utilize Documentation: For detailed guidance on using these REST API endpoints effectively, refer to our comprehensive documentation. This resource provides step-by-step instructions on integration techniques and best practices to ensure smooth and efficient incorporation of your model into your workflow.
  3. Maximize Functionality: By following the documentation, you can maximize the functionality and impact of your deployment, ensuring that your applications leverage the full potential of the deployed model.
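
As a sketch of what such an integration can look like, the snippet below prepares an authenticated POST request carrying a base64-encoded image, using only the Python standard library. The JSON field names and authorization scheme are assumptions for illustration; the actual request format is defined in the Ikomia SCALE documentation.

```python
import base64
import json
from urllib import request

def build_endpoint_request(endpoint_url: str, image_bytes: bytes, token: str) -> request.Request:
    """Prepare a POST request with a base64-encoded image (illustrative schema)."""
    payload = json.dumps(
        {"image": base64.b64encode(image_bytes).decode("ascii")}
    ).encode("utf-8")
    return request.Request(
        endpoint_url,
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
```

Sending the prepared request with `urllib.request.urlopen(...)` would return the deployment's response, which you then parse according to the endpoint's documented output schema.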

You can find this valuable resource here. This will help you streamline the integration process and enhance the performance of your computer vision applications.

References

[1] https://drive.google.com/file/d/1Mz3WmtcvUl-00gaT2XKCxdE5-pqbOOjz/view

[2] https://www.ikomia.ai/blog/a-step-by-step-guide-to-creating-virtual-environments-in-python
