Privacy-Preserving Portrait Matting: Secure Image Editing

Images are central to how we interact online, which makes it increasingly important to protect personal information while processing them. P3M (Privacy-Preserving Portrait Matting) addresses this need by combining deep learning with privacy protection.

Portrait matting means separating a person in a photo from the background. P3M does this without revealing who the person is, so people can benefit from modern image editing without compromising their privacy.

What is P3M Portrait Matting?

P3M (Privacy-Preserving Portrait Matting) is a deep learning approach to portrait matting that protects the privacy of the people in the images. It tackles the problem of separating a portrait subject from its background (matting) without exposing the subject's identity.
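
To make the matting task concrete: a matte is usually a per-pixel opacity map alpha in [0, 1], and an image is modeled as I = alpha * F + (1 - alpha) * B, where F is the foreground and B the background. The following is a minimal NumPy sketch of that compositing step, assuming float images in [0, 1]. The function name `composite` and the toy arrays are illustrative and are not part of P3M or the Ikomia API.

```python
import numpy as np


def composite(foreground, background, alpha):
    """Blend foreground over background using an alpha matte.

    foreground, background: float arrays of shape (H, W, 3) in [0, 1].
    alpha: float array of shape (H, W) in [0, 1]; 1 means fully foreground.
    """
    a = alpha[..., None]  # add a channel axis so alpha broadcasts over RGB
    return a * foreground + (1.0 - a) * background


# Toy 2x2 example: white subject placed on a black replacement background.
fg = np.ones((2, 2, 3))
bg = np.zeros((2, 2, 3))
alpha = np.array([[1.0, 0.5],
                  [0.0, 0.25]])
out = composite(fg, bg, alpha)
```

With a white foreground and a black background, every channel of `out` equals the matte itself, which is a quick way to check that the matte is applied as intended.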

Privacy-Preserving Methods

Privacy-preserving methods in image processing and computer vision aim to anonymize personal information in images, such as faces, without degrading the quality of the task at hand, such as matting. These methods involve techniques like blurring, pixelation, or generating synthetic data that maintains the utility for specific tasks while ensuring privacy.
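
As an illustration of one of the obfuscation techniques mentioned above, here is a minimal sketch of pixelation with NumPy. It assumes the face location is already known; the box coordinates, the function name `pixelate_region`, and the random test image are hypothetical and do not describe how any particular dataset was anonymized.

```python
import numpy as np


def pixelate_region(image, box, block=8):
    """Return a copy of `image` with the region `box` pixelated.

    image: uint8 array of shape (H, W, C).
    box: (top, left, bottom, right) in pixel coordinates, end-exclusive.
    block: side length of each mosaic tile in pixels.
    """
    top, left, bottom, right = box
    out = image.copy()
    for y in range(top, bottom, block):
        for x in range(left, right, block):
            y2, x2 = min(y + block, bottom), min(x + block, right)
            tile = out[y:y2, x:x2]  # view into `out`, so writes modify `out`
            # Replace every pixel in the tile with the tile's mean colour.
            tile[...] = tile.mean(axis=(0, 1)).astype(image.dtype)
    return out


rng = np.random.default_rng(0)
photo = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
face_box = (16, 16, 48, 48)  # hypothetical face location
anonymized = pixelate_region(photo, face_box, block=8)
```

Pixels outside the box are left untouched, so the hair and body boundaries that matter for matting are preserved while the facial detail is removed.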

The Architecture of P3M

P3M uses a multi-task framework in which a segmentation decoder and a matting decoder share a single encoder. This structure is key to producing high-quality portrait mattes without compromising individual privacy.

The following sections describe each component of the architecture:

Multi-task Framework

The foundation of P3M portrait matting's architecture is a multi-task framework that efficiently balances the intricacies of semantic segmentation and detail matting. This framework is instrumental in processing privacy-preserved images, such as those with obfuscated faces, ensuring that the privacy constraints do not impede the matting quality.

The multi-task nature allows for simultaneous learning of global image features and specific task-relevant features, optimizing the model's performance on both fronts.

Overview of P3M portrait matting multitask framework. [1]

Shared Encoder

At the heart of P3M portrait matting architecture lies the shared encoder, a modified version of ResNet-34 equipped with max pooling layers. This lightweight backbone is selected for its efficiency and effectiveness in capturing base visual features from the input images. The encoder serves as the foundational layer from which semantic and detail features are derived, ensuring that the essential visual cues are preserved even in the face of privacy-enhanced inputs.

Decoders for Segmentation and Matting

P3M portrait matting employs two dedicated decoders: one for semantic segmentation and another for matting. Each decoder is structured with five blocks, comprising three convolution layers each, tailored to their specific tasks. The segmentation decoder utilizes bilinear interpolation for upsampling, focusing on the broader semantic understanding of the image.

Conversely, the matting decoder employs a max unpooling operation, leveraging the indices from the encoder to refine the detail matting with precision, capturing the fine details necessary for high-quality matting output.
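
The idea behind max unpooling can be sketched in a few lines of NumPy. This is an illustrative single-channel 2x2 version, not the P3M implementation: pooling records where each maximum came from, and unpooling writes each value back to exactly that position instead of spreading it over the window as interpolation would. The function names are hypothetical.

```python
import numpy as np


def max_pool_2x2(x):
    """2x2 max pooling on an (H, W) map; H and W must be even.

    Returns the pooled map and, for each window, the flat position (0..3)
    of its maximum inside that window.
    """
    h, w = x.shape
    # Group pixels into (H/2, W/2) windows of 4 values each.
    windows = (x.reshape(h // 2, 2, w // 2, 2)
                .transpose(0, 2, 1, 3)
                .reshape(h // 2, w // 2, 4))
    idx = windows.argmax(axis=-1)
    pooled = np.take_along_axis(windows, idx[..., None], axis=-1)[..., 0]
    return pooled, idx


def max_unpool_2x2(pooled, idx):
    """Put each pooled value back at its recorded position; zeros elsewhere."""
    h2, w2 = pooled.shape
    windows = np.zeros((h2, w2, 4), dtype=pooled.dtype)
    np.put_along_axis(windows, idx[..., None], pooled[..., None], axis=-1)
    return (windows.reshape(h2, w2, 2, 2)
                   .transpose(0, 2, 1, 3)
                   .reshape(h2 * 2, w2 * 2))


x = np.array([[1, 2, 0, 0],
              [3, 4, 0, 5],
              [9, 0, 1, 1],
              [0, 0, 1, 7]], dtype=float)
pooled, idx = max_pool_2x2(x)
restored = max_unpool_2x2(pooled, idx)
```

Because the positions of the maxima survive the round trip, sharp structures such as hair strands can be restored at their original pixel locations, which is why this operation suits the detail-oriented matting branch.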

Tripartite-Feature Integration (TFI)

The TFI module is a cornerstone innovation in P3M portrait matting architecture, designed to integrate features from three critical sources: the previous block of the matting decoder, the corresponding block of the segmentation decoder, and the symmetrical block from the encoder.

This integration facilitates a comprehensive understanding of the image, ensuring that both global semantics and fine details are considered in the matting process. The TFI module exemplifies the model's ability to maintain the integrity of the matting results while adhering to privacy constraints.

P3M architecture: Tripartite-Feature Integration (TFI). [1]
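
The text above states which three feature sources TFI combines, but not how they are combined. The sketch below is therefore only one plausible reading, under stated assumptions: the three feature maps have already been brought to the same resolution, and fusion is channel concatenation followed by a 1x1 projection and a ReLU. The function name `tfi_fuse`, the shapes, and the random weights are hypothetical; see [1] for the actual layer configuration.

```python
import numpy as np


def tfi_fuse(prev_matting, segmentation, encoder, weight):
    """Fuse three (C, H, W) feature maps into one (C_out, H, W) map.

    Assumption: fusion = channel concatenation + 1x1 projection + ReLU.
    weight: array of shape (C_out, 3 * C), acting as a 1x1 convolution kernel.
    """
    stacked = np.concatenate([prev_matting, segmentation, encoder], axis=0)
    # A 1x1 convolution is a per-pixel matrix multiply over channels.
    fused = np.einsum("oc,chw->ohw", weight, stacked)
    return np.maximum(fused, 0.0)  # ReLU


rng = np.random.default_rng(0)
c, h, w = 4, 8, 8
features = [rng.standard_normal((c, h, w)) for _ in range(3)]
weight = rng.standard_normal((c, 3 * c)) * 0.1  # random stand-in for learned weights
fused = tfi_fuse(*features, weight)
```

The point of the sketch is the data flow: every output pixel of the matting branch can draw on detail from the encoder, semantics from the segmentation branch, and the matting decoder's own previous state.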

Bipartite-Feature Integration Modules

Complementing the TFI, P3M portrait matting introduces two bipartite-feature integration modules: the Deep Bipartite-Feature Integration (dBFI) and the Shallow Bipartite-Feature Integration (sBFI). These modules are tasked with leveraging deep and shallow features, respectively.

P3M architecture: Bipartite-Feature Integration. [1]

The dBFI focuses on integrating high-level semantic features with the encoder's output to enhance the segmentation decoder.

In contrast, the sBFI module uses finer details from the encoder to improve the precision of the matting decoder. These modules are pivotal in ensuring that P3M portrait matting achieves a delicate balance between preserving privacy and delivering exceptional matting quality.

Privacy-preserving Dataset

The P3M-10k dataset, introduced as part of the study, is the first large-scale anonymized dataset for portrait matting designed with privacy preservation at its core.

This publicly available dataset consists of 10,000 high-resolution portrait images with face obfuscation to protect privacy, alongside high-quality ground truth alpha mattes. The dataset was created to enable the development and evaluation of matting techniques that respect user privacy.

For evaluation purposes, the dataset was divided into two test sets:

P3M-500-P: A set of 500 face-blurred images from the P3M-10k dataset, used to validate the performance of matting models under privacy-preserving conditions.

P3M-500-NP: An additional set of 500 public celebrity images collected from the internet without face obfuscation, aimed at evaluating the models' performance on normal (non-privacy-preserving) portrait images.
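
This article does not say which scores are computed on the two test sets. As an illustration only, the sketch below computes three error measures that are commonly reported for alpha mattes: sum of absolute differences (SAD), mean absolute difference (MAD), and mean squared error (MSE). The function name `matting_errors` and the toy arrays are hypothetical, and benchmark papers often rescale these values.

```python
import numpy as np


def matting_errors(pred, gt):
    """Compare a predicted alpha matte with ground truth (both in [0, 1])."""
    diff = pred.astype(np.float64) - gt.astype(np.float64)
    return {
        "sad": float(np.abs(diff).sum()),   # total absolute error
        "mad": float(np.abs(diff).mean()),  # average absolute error per pixel
        "mse": float((diff ** 2).mean()),   # average squared error per pixel
    }


gt = np.array([[1.0, 1.0],
               [0.0, 0.0]])
pred = np.array([[1.0, 0.5],
                 [0.0, 0.5]])
scores = matting_errors(pred, gt)
```

Running the same function over a face-blurred test set and a non-blurred one gives a direct way to see whether a model trained on anonymized images still generalizes to ordinary portraits.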

Easily run P3M portrait matting for background removal

With the Ikomia API, you can remove the background from an image in just a few lines of code.

Setup

To get started, you need to install the API in a virtual environment [2].


pip install ikomia

Run P3M portrait matting with a few lines of code

You can also load the notebook we have prepared directly.

Go to notebook

Go to Colab


from ikomia.dataprocess.workflow import Workflow
from ikomia.utils import ik
from ikomia.utils.displayIO import display


# Init your workflow
wf = Workflow()    

# Add the p3m process to the workflow
p3m = wf.add_task(ik.infer_p3m_portrait_matting(
                model_name="resnet34",
                input_size="1024",
                method='HYBRID',
                cuda="True"), auto_connect=True)

# Run workflow on the image
wf.run_on(url="https://github.com/Ikomia-dev/notebooks/blob/main/examples/img/img_portrait_4.jpg?raw=true")

# Inspect your results
display(p3m.get_input(0).get_image()) 
display(p3m.get_output(0).get_image())
display(p3m.get_output(1).get_image())

List of parameters:

model_name (str) - default 'resnet34': Name of the model, 'resnet34' or 'vitae-s'
input_size (int) - default '1024': Size of the input image (stride of 32)
method (str) - default 'HYBRID': Inference method, 'HYBRID' or 'RESIZE'
cuda (bool): If True, run CUDA-based inference (GPU). If False, run on CPU.
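
If "stride of 32" means that input_size should be a multiple of 32 (an assumption; the parameter description does not spell this out), a small helper can pick a valid value before it is passed to the task. The helper name `nearest_valid_size` is hypothetical and not part of the Ikomia API.

```python
def nearest_valid_size(size, stride=32):
    """Round `size` to the nearest positive multiple of `stride`."""
    return max(stride, int(round(size / stride)) * stride)


# The workflow example passes parameter values as strings, so convert at the end.
input_size = str(nearest_valid_size(1000))
```

Here a requested size of 1000 becomes "992", the closest multiple of 32, while 1024 is left unchanged.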

P3M portrait matting — Original image source. [3]

Explore further with the Ikomia API.

This article covered background removal with P3M portrait matting. The Ikomia platform also offers other image matting algorithms, including MODNet.

Resources

Ikomia HUB showcases algorithms and offers accessible code snippets for easy experimentation and capability assessment.
Ikomia documentation provides detailed guidance on maximizing the API's potential.
Ikomia STUDIO extends the ecosystem, offering a user-friendly interface with the same functionalities as the API, ideal for those seeking an intuitive, visual approach to image processing.
