Official implementation for: Blended Diffusion for Text-driven Editing of Natural Images.

Last update: Dec 30, 2022

Overview

Blended Diffusion for Text-driven Editing of Natural Images

Omri Avrahami, Dani Lischinski, Ohad Fried

Abstract: Natural language offers a highly intuitive interface for image editing. In this paper, we introduce the first solution for performing local (region-based) edits in generic natural images, based on a natural language description along with an ROI mask. We achieve our goal by leveraging and combining a pretrained language-image model (CLIP), to steer the edit towards a user-provided text prompt, with a denoising diffusion probabilistic model (DDPM) to generate natural-looking results. To seamlessly fuse the edited region with the unchanged parts of the image, we spatially blend noised versions of the input image with the local text-guided diffusion latent at a progression of noise levels. In addition, we show that adding augmentations to the diffusion process mitigates adversarial results. We compare against several baselines and related methods, both qualitatively and quantitatively, and show that our method outperforms these solutions in terms of overall realism, ability to preserve the background and matching the text. Finally, we show several text-driven editing applications, including adding a new object to an image, removing/replacing/altering existing objects, background replacement, and image extrapolation.
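The blending step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' released code: the image is a flat list of pixel values, `guided_step` is a hypothetical stand-in for one text-guided (CLIP-steered) denoising step, and `alpha_bars` is an assumed noise schedule running from the noisiest level to 1.0 (no noise). At each level, the guided result is kept inside the mask while the region outside the mask is replaced by the input image noised to that same level.

```python
import math
import random


def noise_to_level(x0, alpha_bar, rng):
    """Forward-diffuse clean pixels x0 to the level whose cumulative
    signal fraction is alpha_bar (1.0 means no noise at all)."""
    signal = math.sqrt(alpha_bar)
    sigma = math.sqrt(1.0 - alpha_bar)
    return [signal * v + sigma * rng.gauss(0.0, 1.0) for v in x0]


def blend(foreground, background, mask):
    """Keep the foreground where mask == 1 and the background where mask == 0."""
    return [m * f + (1.0 - m) * b for f, b, m in zip(foreground, background, mask)]


def blended_edit(x0, mask, guided_step, alpha_bars, rng):
    """Run the blending loop over a progression of noise levels.

    guided_step(x, alpha_bar) is a hypothetical callable standing in for one
    text-guided denoising step; it is an assumption of this sketch.
    """
    # Start from the input image noised to the first (noisiest) level.
    x = noise_to_level(x0, alpha_bars[0], rng)
    for alpha_bar in alpha_bars[1:]:
        fg = guided_step(x, alpha_bar)            # edited content (inside the ROI)
        bg = noise_to_level(x0, alpha_bar, rng)   # input noised to the same level
        x = blend(fg, bg, mask)                   # spatial blend at this noise level
    return x


if __name__ == "__main__":
    rng = random.Random(0)
    x0 = [0.1, 0.2, 0.3, 0.4]
    mask = [0.0, 1.0, 1.0, 0.0]
    # Toy guided step: always proposes a constant value of 0.9.
    result = blended_edit(x0, mask, lambda x, a: [0.9] * len(x), [0.1, 0.5, 1.0], rng)
    print(result)
```

Because the last level in this sketch has no noise, pixels outside the mask end up identical to the input, which is the background-preservation property the abstract refers to.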

Applications

Multiple synthesis results for the same prompt

Synthesis results for different prompts

Altering part of an existing object

Background replacement

Scribble-guided editing

Text-guided extrapolation

Composing several applications

Code availability

Full code will be released soon.
